RE: Information: Brad's reply (was Information: a very

Brad Jones (bjones@tartarus.uwa.edu.au)
Wed, 1 Jul 1998 18:33:50 +0800

>
> Brad,
>
> > >
> > > 1. DNA is a channel
> > > Random mutations will *decrease* the mutual information between
> > > source and sink.
> >
> > Anything that stores information is a channel. Books, CDs etc. As such
> > DNA seems to fit into this category quite well. A channel can output
> > information, but this does not make it a source of information. Easy
> > mistake to make.
>
> The question is whether it is meaningful to consider DNA a channel. This
> is a question to be answered theoretically, not by checking definitions.
> (BTW, most channels considered in information theory are memoryless, as
> Glenn has been saying. And channels don't 'output' information;
> information
> comes *through* them.)

You mean, does it help your position to consider DNA as a channel? Not
whether it is meaningful to consider it as one. DNA is a device which
contains stored information, therefore it is meaningful to consider it as a
channel (even if it doesn't help your position).

>
> > > 3. DNA as source
> > > Random mutations will *increase* the information content of the
> > > source.
> >
> > This is just the wrong way of looking at it. My corrections to Glenn's
> > use of information theory on this topic should show that. As nobody
> > tried to correct my maths, I assume that Glenn now sees that his use of
> > the simplistic formula was incorrect?
>
> The problem is that you are considering DNA a channel. If you aren't
> willing to reconsider that (as well as firming up your language a bit),
> we'll probably have to declare the discussion over.

I will not reconsider looking at DNA as a channel, because it is one. On the
other hand, if you pretend it is a source you can easily find that it has
zero information (which is what any channel will turn out to have).

>
> [more disagreements about same]
>
> The application of what you are talking about *does* exist. That is,
> if we want to know what dino DNA looked like, then the dino DNA is
> the source, and modern descendants of dinosaurs (as well as their
> relatives) are the channel through which we have to 'receive' the
> information about the original dino DNA. The errors in transmission
> decrease the mutual information between what we see in some strands
> now and the dino DNA, which is why it is probably hopeless to do
> a reconstruction a la Jurassic Park (amid other equally or more important
> reasons).
>

A source *creates* information. DNA does not create information, it stores
it. You were arguing that the information is created by random mutations,
were you not? By your own argument random mutations are the source and DNA
is the channel. Why is this so hard?

> Everyone already knows this, and it should be clear that this is a
> special case. It is appropriate to consider intervening DNA a channel
> because we're interested in ancient DNA sequences. This is basically
> never true "in the wild," which is why info theory is never applied in
> that way there.
>
> > I want to repeat here that sources are NOT random. They are deliberate
> > meaningful information. This seems to be a major misunderstanding here.
> >
> > Information theory does not ascribe meaning to a source, but it
> > definitely ASSUMES that it is meaningful. The purpose of information
> > theory is therefore to transmit the MEANING of the data in as efficient
> > a manner as possible.
> >
> > Of course random noise has no meaning and so the most efficient manner
> > of transmitting it is just not to....
>
> Sorry, but this is just absolutely wrong. I don't know of a better way
> to say it.
>
> > the question is: Does noise have more information than any other signal?
>
> A noise source will generate maximal information, yes.

No. A noise source generates no information. A source that has equiprobable
symbols does generate the most information, but it is not random. Do you get
this yet?

For example, a source coding technique will even up the appearance of
symbols in the encoded output.
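
To make this concrete, here is a minimal sketch (Python, with made-up
distributions) of the entropy formula H = -sum(p * log2(p)). Equiprobable
symbols maximise it, and it only ever sees the probabilities, never what
the symbols mean:

from math import log2

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)), in bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Four equiprobable symbols: the maximum, log2(4) = 2 bits/symbol.
print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0

# A skewed source measures below the maximum.
print(entropy([0.7, 0.1, 0.1, 0.1]))       # ~1.36

# A source that only ever emits one symbol measures zero.
print(entropy([1.0]))                       # 0.0

The formula quantifies whatever distribution you feed it; whether those
probabilities describe English text or a noise generator is outside the
maths, which is exactly why I keep saying it is a model.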

>
> > Glenn (and others) previously stated that it had maximal information.
> > That means that they agree with the above statement (The most
> > informative source is noise).
> >
> > Do you want to hear the Lecturer's response to this question? (Dr
> > Roberto Togneri, http://www.ee.uwa.edu.au/staff/togneri.r.html/)
>
> Sure, why not.
>
> Do you want to hear what an old guy named Shannon had to say on the topic?
> From his original paper in the second paragraph (for goodness' sake):
>
> The fundamental problem of communication is that of reproducing at
> one point either exactly or approximately a message selected at another
> point. Frequently the messages have meaning; that is they refer
> to or are correlated according to some system with certain physical
> or conceptual entities. These semantic aspects of communication are
> irrelevant to the engineering problem. The significant aspect is that
> the actual message is one selected from a set of possible messages.
> The system must be designed to operate for each possible selection, not
> just the one which will actually be chosen since this is unknown at the
> time of design.
>
> ...and from page 5...
>
> We can think of a discrete source as generating the message, symbol
> by symbol. It will choose successive symbols according to certain
> probabilities depending, in general, on preceding choices as well as
> the particular symbols in question. A physical system, or a
> mathematical model of a system which produces such a sequence
> of symbols
> governed by a set of probabilities, is known as a stochastic process.

Well, I agree 100% with Shannon here.

When he talks about probabilities of symbols he does not mean they are
randomly generated. This is what you do not seem to understand. Sure, you
may model what will be sent with a random source, but this is a TOOL for
building systems; you don't ever actually send random data.

If you know that random data will be sent, you don't bother building
anything....

What he says about meaning is also 100% correct (as you would expect... ).

He is saying that we don't have to know what the English text says (for
example), but we sure as hell know it is English text as opposed to random
noise. Do you get this? English text can be modeled by symbol
probabilities, but that does not mean that what is transmitted was
generated by a random process.

What I write can be analysed to give the probabilities of what I'll say
next, then a random source can be written to model how I write. This could
then be used to design the optimum transmission system for my words. BUT
the whole point is to transmit my words, not the random symbols. Do you get
this yet?
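
As a sketch of that workflow (Python, with a toy sample standing in for my
writing): analyse the text for its symbol probabilities, then drive a
random source with them. The output is statistically like me, but it is
only a design tool:

import random
from collections import Counter

sample = "the whole point is to transmit my words not the random symbols"

# Step 1: analyse the real text to estimate symbol probabilities.
counts = Counter(sample)
total = sum(counts.values())
probs = {ch: n / total for ch, n in counts.items()}

# Step 2: a random source governed by those probabilities models the writer.
symbols = list(probs)
weights = [probs[ch] for ch in symbols]
print("".join(random.choices(symbols, weights=weights, k=40)))

The model output has the same first-order statistics as the sample and none
of the meaning; you design the system against the model, then use it to
send the real words.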

The random source DOES NOT have information. But if we assume it stands in
for me, then we can pretend it does and use this to design a system.

That is what symbol probabilities are for: to model real systems. MODEL
MODEL MODEL. They are NOT the actual information source. They are a TOOL
for designing the systems that will deal with the REAL information.

If you want to know the capacity of a channel then sure, hook up a random
source; this will test the capacity. BUT in doing this you are PRETENDING
that the random source is putting out information.
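
A sketch of that capacity test (Python, assuming a binary symmetric channel
with a made-up 10% flip rate): drive it with equiprobable random bits and
the measured mutual information between input and output comes out near the
capacity C = 1 - H(0.1):

import random
from collections import Counter
from math import log2

def h2(p):
    """Binary entropy function, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

flip_prob = 0.1   # assumed channel error rate
n = 200_000

# Drive the channel with an equiprobable random source; record (in, out).
joint = Counter()
for _ in range(n):
    x = random.getrandbits(1)
    y = x ^ (random.random() < flip_prob)   # binary symmetric channel
    joint[(x, y)] += 1

# Empirical mutual information I(X;Y) from the joint counts.
px = {v: sum(c for (a, _), c in joint.items() if a == v) / n for v in (0, 1)}
py = {v: sum(c for (_, b), c in joint.items() if b == v) / n for v in (0, 1)}
mi = sum((c / n) * log2((c / n) / (px[x] * py[y]))
         for (x, y), c in joint.items())

print(f"measured I(X;Y) = {mi:.3f} bits")
print(f"capacity 1 - H(0.1) = {1 - h2(flip_prob):.3f} bits")

The random bits are only a test signal; the number you get out is a
property of the channel, not information created by the source.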

As I said earlier, if the random source is actually what you want to
transmit (maybe you guys would?) then you would analyse the random source
and find out "Gee, this thing is just noise" then you would model the
information in that by NOTHING then design a model infosource which does
NOTHING.

Another point is that equiprobable random sources are used to MODEL real
sources because equiprobable symbols are desirable for efficiency, but that
means that the input data is coded to be equiprobable. It does not mean that
equiprobable random data is sent.
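
For instance (a Python sketch, with a made-up skewed source): Huffman-code
it and the 0s and 1s of the *coded* stream come out equiprobable, even
though the data itself is anything but random:

import heapq

def huffman_code(probs):
    """Build a Huffman code {symbol: bitstring} from symbol probabilities."""
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, low = heapq.heappop(heap)
        p2, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# A heavily skewed, decidedly non-random source.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(probs)
print(code)   # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}

# Fraction of 1-bits in the coded stream: 0.5, i.e. equiprobable.
ones = sum(probs[s] * code[s].count("1") for s in code)
bits = sum(probs[s] * len(code[s]) for s in code)
print(ones / bits)

The coded bits look random to anyone who doesn't have the code; the meaning
is still carried, it has just been squeezed into the most efficient
representation.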

>
> So, it would appear that the source materials (math hasn't changed in the
> last 50 years, BTW), indicate that information theory is to be applied
> *without worrying about the meaning of the information*, and information
> sources are to be considered *stochastic processes* ('stochastic' means
> 'random'; look it up).

Dead right. I 100% agree. I have been stating all along that info theory
does not worry about whether the message says "Hello" or "Goodbye". But it
does care that it says SOMETHING.

>
> I'm still interested in what your professor has to say on the subject,
> though, since it may give us some understanding of where you're coming
> from.

Quote from Dr Togneri's written answers to 1997 exam paper:

"Disagree! Info theory is not nonsense. The person listening to radio static
is nonsense. Info theory quantifies the information content of a source, it
does not ascribe any meaning to the source"

I hope you will think long and hard about this statement. It fits both what
I have been saying and your quotes from Shannon.

It means that information theory does not care what the meaning is, but
there must be meaning there. If there is no meaning there, then to transmit
that meaning you don't have to send anything; i.e. noise can be represented
by zero information.

I'll repeat this again in the hope that somebody understands it:

Engineers DO NOT create devices to transmit random bits; they do it to
transmit meaning. This is the whole point. Reduce the meaning to the most
efficient representation and transmit it with no errors.

If random models are used to model the actual source then that is fine, but
the systems are not designed to transmit random data because it has no
value/information/meaning.

Incidentally, this is way off track with respect to the creation of
information, which was what was originally being debated.

--------------------------------------------
Brad Jones
3rd Year BE(IT)
Electrical & Electronic Engineering
University of Western Australia