RE: Information: Brad's reply (was Information: a very

Brad Jones (bjones@tartarus.uwa.edu.au)
Fri, 3 Jul 1998 16:35:52 +0800

Glenn,

> At 06:34 PM 7/1/98 +0800, Brad Jones wrote:
> >>
> >> At 01:19 PM 6/30/98 +0800, Brad Jones wrote:
> >> Greg expressed his opinion that macroevolution can't occur. He didn't
> >> cite any data; I cited the last chapter of Gilbert's Developmental
> >> Biology. Why wouldn't you be curious enough to at least go look at that
> >> book rather than accept an opinion that agrees with your own as
> >> authoritative?
> >>
> >
> >I appreciated Greg's opinion because it was clear, concise and seemed to
> >be reasonable. I do not have time to research every point.
>
> This is one of the most fascinating things about the creation/evolution
> debate, at least to me. Far too often people refuse to go look at data,
> yet continue to be dazzled by the truth of their own opinion. I really am
> quite fascinated by this phenomenon which occurs quite often in the C/E
> debate.
>

What I was saying is that Greg answered my question clearly. You did not.
Therefore I prefer Greg's answer. What is wrong with that?

Surely it is my own call as to which explanation makes the most sense to me?

> Brad wrote:
> >> >A DNA sequence of AAAAATAAAA will output this each and every
> >> >time eg: AAAAATAAAA AAAAATAAAA AAAAATAAAA
> >>
>
> I replied:
> >> If I take the above statement as being sequential then it is an
> >> assertion that DNA is repeated segments of
> >> AAAAATAAAAAAAAATAAAAAAAAATAAAAAAAAATAAAA..., which is clearly nonsense
> >> and observationally wrong. But if placed in a generational axis, the
> >> above statement makes sense. I was merely trying to give you the benefit
> >> of the doubt.
> >>
> >
> Brad wrote:
> >You are correct, that would be nonsense.
>
> Thank you. At least we can agree on one item.
>
> >But modeling DNA as a source is nonsense anyway, so what's the
> >difference? No finite sequence of symbols is a source.
>
> But I must strenuously disagree with the above. If NO FINITE sequence of
> symbols is a source, then, my good 3rd year student friend, there are NO
> SOURCES IN THIS UNIVERSE. Everything in the universe is finite and since
> no finite sequence can be a source, therefore there must not be any sources.
> By the way, there are only 10^80 particles in the universe so no sequence
> is longer than that and 10^80, while large is exceptionally finite!
>

OK, "finite source" may not be the clearest way to express what I was trying
to say...

Basically a source creates information. Once we have a set of symbols, they
are no longer new information to us. If we get sent them again, that is
nothing new either. A source is something which outputs information we don't
already have. DNA does not do this.

You have also never stated how you model the actual random changes, i.e.
sources do not ever randomly change in information theory. Random changes to
data are ALWAYS made by a channel: sources create information, CHANNELS
change it.

A channel is something which data gets put into by something (may be a
source or another channel) and then read out of by something else. I think
this fits DNA much better.

DNA replicating is one channel feeding another, the same as copying a tape
(Yockey's comparison, not mine).
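
A minimal sketch of that channel view, for illustration only. The function
name `replicate`, the uniform-substitution error model and the exaggerated
error rate are all assumptions made for the example; none of them comes from
Yockey or from real replication chemistry.

```python
import random

BASES = "ACGT"

def replicate(seq, error_rate, rng):
    """Copy seq through a noisy channel (hypothetical model): each base is
    replaced by a different base, chosen uniformly, with probability
    error_rate; otherwise it is copied unchanged."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

rng = random.Random(0)                      # fixed seed so runs are repeatable
parent = "AAAAATAAAA"
exact_copy = replicate(parent, 0.0, rng)    # noiseless channel
noisy_copy = replicate(parent, 0.3, rng)    # deliberately exaggerated rate
print(exact_copy, noisy_copy)
```

The point of the sketch is only that the input is handed to the channel from
outside and the channel's sole contribution is the occasional substitution.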

The big question is the information that is in the DNA.

Is that information random? No way. Was it built by adding random noise? Of
course not, or it would be random itself.

>
> >
>
>
> >by sequential I was considering the proteins that are created by the DNA,
> >ie DNA specifies proteins and does so repeatedly. If this does not happen
> >or I am totally on the wrong track then I apologise, I am no biology
> >expert as I have said.
>
> Learning is what everyone must do on earth.
>
> >
> >I am here to debate information theory and your comments regarding the
> >creation of information. I disagree that "information can be created by
> >random mutations" independent of the biological mechanisms.
>
> Greg cited Shannon, Greg cited your own professor, I have cited Yockey. I
> don't think much will convince you. Tell your professor that he has been
> teaching incorrect things.
>
> >
> >If Yockey has relevant points then I assume you will bring them out.
>
> What do you think the quotations are? I type them in because I believe
> them to be relevant, not because I like typing.

Hmm, well why did you not respond when I pointed out that Yockey considers a
tape recorder as a good analogy?

That must be relevant if you typed it in...

>
> >I am
> >looking for Yockey's book but it is not readily available.
>
> It is through interlibrary loan. Everything is available through
> interlibrary loan.

Hmm, know a lot about Australia's library system do we? Sure I may be able
to get it and I am still trying. It is not a common book and as such I don't
have it yet.

As it seems that you do not believe me (why I would lie about the
availability of a book is beyond me), go to this URL and search for
Yockey. It is the Library and Information Services of Western Australia.
http://innopac.liswa.wa.gov.au/search/a

Here you can see that Yockey's book is not in the state library system.
Happy?

>
> >I am not sure what this proves.
> >
> >I agree that information theory does not have to know the meaning. It
> >sure helps knowing that there is meaning though.
>
> No it doesn't. You mentioned that you could compress a sequence better if
> you knew it had meaning. This famous sentence by the linguist Chomsky has
> no meaning but is a perfectly good English sentence.
>
> 'Colorless green ideas sleep furiously.'

hmm, the point is?

If it doesn't have meaning then there is no point sending it at all, was what
I was saying. If the point of that sentence is "English doesn't have to make
sense" then that still conveys meaning about the English language, doesn't
it?

Once again: information without meaning is not considered as information
because there is no reason to transmit it.

You *could* send garbage, sure, but there is no point, so why do it? That is
my point. The term "information" as I use it implies some informative value
to someone or something. Otherwise there is no point to it. Something which
has no point is to be avoided in engineering; nobody will spend money
transmitting something to which there is no point.

Same as DNA containing gibberish would not create cells.

>
>
> >
> >
> >>
> >>
> >> [of DNA brad wrote]
> >> >I thought the error rate was pretty small, around 10E-9 if I remember
> >> >correctly, that is easily small enough to compare to a CD.
> >>
> >> the error rate only applies to the generational axis. There is NO error
> >> rate in the sequential axis of a non-reproducing DNA sequence.
> >
> >if there is no error rate then there is nothing to discuss on the
> >sequential axis as far as information creation is concerned.
> >
> >We are discussing the information created by random mutations, if there
> >are no mutations then there can be no information created. Read quote
> >below:
> >
> >***********************************************************************
> >>But in our original notes on information theory, both Brian and I were
> >>talking about the Sequence axis. Information is measured along the
> >>sequence axis, not per se the generational axis.
> >
> >>Thus when I pointed out that the sequence AAAAAAAAAA had zero information
> >>content, and the mutation to AAAAATAAAA represented an increase in
> >>information it does because we are not talking about the generation axis.
> >>But even putting it into your terminology, the output(generational axis)
> >>of the DNA sequence AAAAAAAAAA is not always AAAAAAAAAA but occasionally
> >>is AACAAAAAAA or AAAAATAAAA. There is a Generational markov matrix that
> >>is something like:
> > --Glenn, 28/06/98
> >
> >***********************************************************************
> >
> >So you are directly contradicting yourself here. This is why I have
> >previously (and still do) ask for you to clarify your position.
>
> No, I don't think I was contradicting myself. That discussion was
> concerning how to calculate the information content of a given sequence.
> We measure information along the sequence axis NOT the generational axis.
> And at that time I was understanding you to say that information was
> measured along the generational axis. That aside, the change in information
> content from AAAAAAAAAA to AAAAATAAAA is the same regardless of whether or
> not these two sequences are in a generational axis or not. In fact one
> doesn't even need to use the same symbols ********** and CCCCCTCCCC have
> the same information content as the two A sequences above.
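
For readers following the numbers in the quoted paragraph, here is a rough
sketch of one way such a per-symbol figure could be computed. It assumes the
"information content" meant is the Shannon entropy of the observed symbol
frequencies, which is an assumption on my part since no formula is given in
the quote; `bits_per_symbol` is a name made up for the example.

```python
import math
from collections import Counter

def bits_per_symbol(seq):
    """Shannon entropy (bits per symbol) of the symbol frequencies observed
    in seq. Assumption: frequencies stand in for probabilities."""
    counts = Counter(seq)
    n = len(seq)
    return sum(c / n * math.log2(n / c) for c in counts.values())

for seq in ("AAAAAAAAAA", "AAAAATAAAA", "CCCCCCCCCC", "CCCCCTCCCC"):
    print(seq, round(bits_per_symbol(seq), 3))
```

Under that assumption the figure depends only on how often each symbol
occurs, not on which letters are used, so relabelling A as C changes nothing.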

Well, the result was that your position was and still is not defined.

My position is "Random mutations cannot increase information in DNA".

That can only be argued on the generational axis because that is when DNA
itself changes. You yourself stated that there are no mutations on the
sequential axis, so we must be talking about the generational axis.

If that is where the changes occur then that is where the information must
be created.

That is also where my comparison with the CD (or I'll use the tape recorder
as Yockey does if you like) comes in.

Copying a tape and introducing random errors will always REDUCE the
information transfer between tapes.

Therefore using Yockey's comparison we get:

DNA replicating with random mutations will always reduce the information
transferred to the new DNA.

Do you still disagree with this?
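
To put a number on "information transfer between tapes", here is a small
sketch using the textbook mutual information of a symmetric channel. The
assumptions are mine: four equally likely symbols at the input, and a wrong
copy being any of the other three symbols with equal probability. The
function name `mutual_information` is made up for the example. It measures
only how much the copy tells you about the original, nothing more.

```python
import math

def mutual_information(p, alphabet_size=4):
    """I(X;Y) in bits per symbol for a symmetric channel with uniform input.
    p is the probability that a symbol is copied wrongly (assumed model)."""
    h_noise = 0.0                       # H(Y|X): uncertainty added by copying
    if p > 0:
        h_noise -= p * math.log2(p / (alphabet_size - 1))
    if p < 1:
        h_noise -= (1 - p) * math.log2(1 - p)
    # With uniform input the output is uniform too, so H(Y) = log2(size).
    return math.log2(alphabet_size) - h_noise

for p in (0.0, 1e-9, 1e-3, 0.1, 0.75):
    print(f"error rate {p:g}: {mutual_information(p):.6f} bits per base")
```

In this model the transferred information falls steadily as the error rate
rises from 0 to 0.75, where the copy says nothing at all about the original.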

>
> >>
> >> >
> >> >ANY information storage device IS a channel. Do not make the mistake
> >> >of thinking anything that has an output is a source. This is not
> >> >correct.
> >> >
> >>
> >> I think there is some difference in terminology as noted above.
> >
> >I think not
>
> And when you do this with Greg,Brian and myself, this is why no progress
> can be made.

That is what I think, so that is what I said...

I do not see why we will not make progress, it is an interesting topic and
discussing it can only be a good thing.

>
> >I agree, information does not have to be English language or anything
> >else like that. In fact information can be almost anything that is
> >created to convey some kind of meaning. Random noise is not information.
>
> It is interesting that we keep quoting text books to you and you keep
> saying 'wrong' without quoting any texts to back up your position. I had 3
> or 4 Yockey quotations backing up what I said in this post alone. You
> never cite any authority except your authority.

Hmm, you have never produced a quote from a recognised authority saying
"random mutations increase the information stored in DNA" therefore all your
quotes are subjected to interpretation when looked at in light of this
question. I simply interpret them differently, we both have our opinions on
what the author was saying.

>
> >
> >
> >>
> >> "For this reason we will call H the message entropy of the ensemble of
> >> messages. It is also called the information content since the number of
> >> messages of length N is 2^NH. Each message can be assigned a meaning or
> >> specificity and thus carries *information*[italicized to differentiate
> >> it from the previous use of information--grm], knowledge or
> >> intelligence. H reflects the character of the probability distribution
> >> P[i]. A broad distribution provides a maximum choice at the source or
> >> uncertainty at the receiver. H is a maximum if all p[i]=1/n." ~H. P.
> >> Yockey, "An Application of Information Theory to the Central Dogma and
> >> the Sequence Hypothesis," Journal of Theoretical Biology
> >> 46(1974):369-406, p. 373
> >>
> >> I repeat the last sentence.
> >>
> >> " H is a maximum if all p[i]=1/n" This means that the probabilities of
> >> the characters are RANDOM, RANDOM RANDOM. H is maximum if the
> >> sequence is RANDOM.
> >
> >NO NO NO.
>
> Maybe you should be the editor at Cambridge University Press. They thought
> YES, YES, YES.

Having probabilities gained from statistical analysis does not make the
source random. I can get probabilities of symbols in your writings but that
doesn't make them random, does it?

Equiprobable symbols do not have to be random, they just occur with the same
frequency. They do have the potential to be carrying the most noise, but if
they are truly random then it is noise, nothing else.
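
A quick sketch of the "equiprobable is not the same as random" point. The
repeating ACGT string and the helper names `entropy` and
`conditional_entropy` are invented for the example, and the figures are
plain frequency estimates from 1000 symbols, nothing more rigorous.

```python
import math
import random
from collections import Counter

def entropy(items):
    """Empirical Shannon entropy (bits) of the observed frequencies."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum(c / total * math.log2(total / c) for c in counts.values())

def conditional_entropy(seq):
    """Empirical H(next symbol | previous symbol) = H(pairs) - H(previous)."""
    pairs = list(zip(seq, seq[1:]))
    return entropy(pairs) - entropy(seq[:-1])

periodic = "ACGT" * 250                     # fully predictable, equiprobable
rng = random.Random(1)                      # fixed seed so runs are repeatable
shuffled = "".join(rng.choice("ACGT") for _ in range(1000))

for name, seq in (("periodic", periodic), ("random", shuffled)):
    print(name, round(entropy(seq), 3), round(conditional_entropy(seq), 3))
```

Both strings use the four symbols about equally often, so the symbol-count
entropy is about the same for each. Only the estimate that looks at the
previous symbol separates the predictable string from the random one.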

Once again you are confusing models with the real thing.

>
>
>
> >> >> That is why you can't tell whether I am writing real mandarin
> >> >> chinese (pinyin) below or real gibberish.
> >> >>
> >> >> Ni xue yao xuexi hen duo!
> >> >
> >> >No I cannot tell, but with a bit of investigation I could.
> >>
> >> Do it without consulting a chinese student or a chinese textbook.
> >> Go ahead?
> >
> >Why on earth would I do it without consulting a textbook?!?
>
> Because on Tue, 30 Jun 1998 13:19:49 +0800 you wrote:
>
> >No. Information theory is concerned with transmitting the MEANING in as
> >efficient a manner as possible, therefore it would treat the word
> >differently depending on what language and context it was used it.
>
> If information theory treats a word differently depending on what language
> and context was used, this strongly implies that you must be able to tell
> that there is meaning in a sentence. If you can't do that, then you can't
> treat the sequence differently.

That is true. If you cannot investigate the meaning then you treat it
accordingly. However there are other methods of investigating meaning than
from the message, as I said.

In the case of the question in debate the fact that the source was truly
random, and hence has no meaning was given by yourself. Therefore we already
know it doesn't have meaning and have no need to analyse the message.

Also I never said that we HAD to know the meaning. I said that more
efficient systems can be made when more is known about the source.
(including whether or not it is just noise)

>
> And on 1:19 PM 6/30/98 you wrote:
>
> >No I cannot tell, but with a bit of investigation I could. Once I know
> >which one it is I would be able to find the true information content.
>
>
> The sequence "Ni xue yao xuexi hen duo!" in English means nothing. In
> Mandarin Chinese, it means "You need to study much more"

thats nice

>
> >
> >> Zhe ge mao you mao.
>
> this translates as 'That cat has a hat'

great

>
> >>
> >> xi gong zuo chi xiao xue.
>
> This is gibberish. If you as an engineer were to try to transmit the two
> sentences, you would have to treat them the same because you were not privy
> to the mandarin meaning. This is why meaning is irrelevant in information
> theory.

You are correct. But the fact that I "had" to treat them the same because I
didn't know the difference does not mean that I could not do it better if I
did know the difference.

E.g. if I knew Mandarin was to be transmitted I could design a system
accordingly. If it was noise to be transmitted I would ask "why the hell do
you want noise?"

>
>
> >> If information theory is about meaning tell me the bad word!
> >> lao wu gui
> >>
> >> hen xiao chun
> >>
> >> jiu dian zhong
>
> for the record lao wu gui means 'old turtle' and it is a terrible insult!
>
> >> I wrote:
> >> >> you didn't know ...
> >> >> using mathematics for a non-degenerate code;
> >> >
> >> >I know that the code is not important in the maths I used.
> >>
> >> I cite Yockey again. "The third term in equation (7) is one of the
> >> aspects of information theory in biology which differs from information
> >> theory in electrical engineering. This is because there is no degeneracy
> >> in the codes used in communications." ~H. P. Yockey, "An Application of
> >> Information Theory to the Central Dogma and the Sequence Hypothesis,"
> >> Journal of Theoretical Biology 46(1974):369-406, p. 37
> >>
> >> Can you cite the 3rd term in equation 7 from electrical engineering
> >> texts?
> >
> >umm, what are you talking about here? what is the "3rd term in
> >equation 7"?
>
> Precisely my point. You haven't taken the time and trouble to familiarize
> yourself with the mathematical differences between info theory in EE and
> info theory applied to biology. If you don't know this, then you don't
> know the field. Period!!!!

Hmm, what does my textbook say on line 4 of page 57?

Well, if you don't know then obviously you need to research info theory
more...

Tell me what the equation is and then I might know what the hell you are
talking about here. I do not have Yockey's book, so asking me questions about
part of it serves no purpose whatsoever.

>
> >I believe you misunderstand the textbook and are confusing the use of
> >mathematical models with the actual information source.
>
> Me, Greg, Brian, Yockey, Shannon, and now your own professor.

No. You, Greg and Brian maybe.

>
> >
> >
> >>
> >>
> >> But if you can tell me which of the chinese statements are meaningful, I
> >> will tell you their meaning. I don't expect to hear from you on this.
> >> glenn
> >
> >
> > I'm not going to bother because it is not relevant. I never stated that
> >I could tell information from gibberish from the message alone. I stated
> >that I could investigate the source to tell if it is information. Big
> >differences here.
>
> OK, investigate the source (me). What is interesting is that your statement
> above about needing a chinese textbook to determine the meaning, requires
> that you step OUTSIDE of information theory to determine meaning. That is
> an implicit admission on your part that information theory is unable to
> determine a meaningful sentence from gibberish.

I have already agreed that info theory cannot tell meaning from gibberish.
And yes more than info theory is needed to build things. Why would anyone
limit themselves to one particular subject when dealing with REAL systems?

Real systems are real and I treat them that way.

> This is also an example of why meaning plays no role in information
> theory. I am the source of the sequence below.
>
> Without stepping outside of information theory tell me if this has meaning?
>
> @85y975 w53008ht 975w8e3 9r 8hr94jq589h 5y3946 53oo j3 8r 5y8w
> yqw j3qh8ht'
>

Same as your mandarin example, what does this prove?

I have NEVER stated that information theory finds the meaning of something,
that is clearly absurd.

>
> Brad, you can have the last word in this. I will not post again on this
> topic with you because there is no point in boring our readers with a 'sez
> so- sez not' exchange. The only thing I will answer is the puzzle in the
> above paragraph. You seem to think that as a 3rd year student you are
> correct when even your professor says that a random sequence has the most
> information. This is a serious problem on your part. But you get the last
> word as far as I am concerned.
>

I have always agreed that a random source has the capacity to contain the
most information. A one line response to an unknown question from Dr Togneri
hardly shows anything. Plus I still consider the modeling of DNA as a source
to be wrong; you still have never said why it is better to use a source than
a channel.

If a source is creating information and it *appears* random then it may very
well be producing maximal information, but if it produces gibberish then it
is better modeled as a noise source not an information source. After all,
that is exactly what a noise source is.

The question is about whether or not random mutations increase information
in DNA. I think I have put up a good case as to why they do not.

Random mutations will always reduce the information transfer between
generations of DNA. This is immediately obvious if you use the tape recorder
(or CD) comparison.

Why the discussion is not going anywhere is only because you insist on using
information as meaning "information carrying potential". If you assume that
DNA contains information then that information does get reduced by
mutations. If the mutations are random then they are best modeled as noise
and information theory will clearly show that noise will reduce the
information content.

I have always been treating DNA as a real device that contains real and
valuable instructions on how to build the most amazing devices I have ever
come across. Your claim that random changes to these instructions will
produce better instructions to build better devices is clearly wrong. It is
exactly the same as saying that blueprints for a plane would be improved
(contain more information) if they were subject to random changes (maybe
while they are stored on a tape machine). Planes are not built by random
gibberish and neither are biological systems (which are far more complex).

You consistently ignore the fact that the information contained in DNA has
specific meaning to the construction of a cell and that any random changes
will destroy this information. That is provable by info theory and also
pretty damn obvious if you ask me.

Information considered by itself without reference to meaning may be a
useful tool or a learning aid but we are talking about a REAL system that
has meaning. Therefore any engineer will see that the meaning is important
and cannot be ignored.

How would you like it if your computer suddenly decided that random noise is
a more efficient source of information than your writings? In fact if you
posted random gibberish to this mailing list you would (by your argument)
be providing us with more information. Do you see the fallacy yet of
applying this narrow and simplistic viewpoint to a real system?

The argument was never about whether DNA could potentially store more
information if it was random. It is about whether or not random information
would have more meaning in the construction of cells. That is why
information theory must consider meaning (not necessarily know it, just
consider it) when dealing with real systems.

If your position all along was "random mutations add information carrying
capacity to DNA but totally stuff up its ability to build living cells"
then we are in total agreement. Otherwise if you still think that "random
mutations turn valuable information into gibberish but this leads to the
creation of everything we see in nature" then we will just have to agree to
disagree.

Since you have nothing further to say I would like to thank everyone who
participated in this discussion as I have enjoyed it greatly.

--------------------------------------------
Brad Jones
3rd Year BE(IT)
Electrical & Electronic Engineering
University of Western Australia