Probability (Was Re: Ken Ham (help))

GRMorton@aol.com
Sun, 18 Feb 1996 15:56:32 -0500

In a message dated 96-02-18 10:18:37 EST, you write:

>To my way of thinking, the biggest argument against Darwinian evolution is
>the sheer mathematical improbability of it occurring, even given untold
>billions of years. You don't even need to invoke Genesis to successfully
>argue this point. Then, when you have cast doubt in the mind of the
>Darwinian evolutionist, at that point you invoke a literal Genesis to make
>your point about a Creator God. To counter Darwinian evolution arguments
>with an OEC viewpoint might not have the impact that a YEC argument might
>have, because you then have two opposing old earth viewpoints battling with
>one another. The YEC arguments are thus more dramatic and 'contrasty'.
>
>I guess the problem (or task) then becomes...how do OECs counter the
>Darwinian evolutionary ideas just as dramatically, convincingly, and
>forcefully...and as successfully, as have the YECs.

I always get myself in trouble when I post here, but I could not let the
probability issue pass by unchallenged. I used to believe the line that the
random formation of a given protein was highly unlikely--that is, until I
worked on the problem. We cannot challenge Darwinian evolution based upon
faulty logic and faulty math. Here is a post I put on another list and it
shows exactly how one can produce by random means a sequence which can
perform a specified task.

glenn
Post below

ABSTRACT: The probability argument against the random finding of
a given sequence is one of the mainstays of the anti-evolutionary
position. I have noted before that I view that argument as a weak
one for a variety of reasons. In this note I will show that the
finding of a functional sequence by a random search is quite
likely on normal evolutionary time scales. Because of this, and
other weaknesses in the traditional apologetic, Christianity
needs to move to a more defensible apologetic.

Duane Gish once wrote:

"The highly specific biological activity of each protein is due
to the precise way the amino acids are arranged, just as the
information conveyed by this sentence is determined by the
precise sequence of 190 letters found in it."~Duane Gish, "The
Origin of Life," Proc. First Int. Conf. on Creationism,
Pittsburgh: Creation Science Fellowship, 1986, p. 62

There is a major problem with that sentence. Gish's wording is
not the only way to state what he wanted to say. For instance, he
could have written "Biological activity is due to very specific
orderings of amino acids as this sentence's meaning is due to the
123 letter order."

This is only a hint of how much variability there is in sequence
space for conveying the same message. There is an amazing
flexibility in the language to perform the same task. I once
calculated and listed over 330,000 ways to convey the
information, "If you pick your nose; you get warts." These vary
from relatively pidgin-like phrases such as "pick nose get wart" to
more complex statements, "If you put your digits into your nares,
you will contract a hypertrophy of the corium." There are various
orders of this statement. It can be reversed: "To contract a
hypertrophy of the corium, place your digits into your nares."
You can also substitute nasal openings, nostrils, or nasal
passages for nares. You can get more gross and talk about what you
pick and extract. :-) All of the sequences were less than 80
characters in length, and I only quit calculating because my
imagination played out and I was getting bored.

So the question is, if I wish to convey a certain message, how
likely is it that I can find a sequence to perform a given
function? There is a way to randomly produce a useful sequence
which is not all that improbable.

Let's use a less gross example than the nose picking one above.
Let's find a functional sequence to answer the question your wife
asked you when you were first married: "What do you want for
breakfast?" (And you thought I was going to say something else.
Tsk tsk.) There are lots of ways to answer this question. What
we will do is choose a 70 unit long sequence drawn from 20 letters,
ruling out the use of z, q, x, k, v and j. Thus, we have in this 70
unit long sequence 20^70 = 1.18 x 10^91 different possible
combinations. Normally the anti-evolutionists say, like Gish, that
finding just the correct sequence is too unlikely to occur. This is
usually based upon the idea that one and only one sequence will
perform the task. This is untrue, as we have seen.
Even finding 330,000 ways to say I want eggs does not solve the
problem. 330,000 ways to say I want eggs out of 1.18 x 10^91 is
still too improbable for one to consider realistically.
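
The size of this search space is easy to check. The short Python
sketch below is only an arithmetic illustration. The constant names
are invented for the example, and reusing the 330,000 tally from the
nose-picking list as the number of working breakfast answers is an
assumption made purely for scale.

```python
from fractions import Fraction

ALPHABET_SIZE = 20         # 26 letters minus z, q, x, k, v, j
SEQUENCE_LENGTH = 70       # units in the random sequence
KNOWN_SOLUTIONS = 330_000  # assumed count of working answers (illustrative)

# Every position can hold any of the 20 letters independently.
total_sequences = ALPHABET_SIZE ** SEQUENCE_LENGTH

# Fraction of the space occupied by the assumed working answers.
fraction_functional = Fraction(KNOWN_SOLUTIONS, total_sequences)

print(f"sequence space:      {total_sequences:.3e}")
print(f"functional fraction: {float(fraction_functional):.3e}")
```

The functional fraction comes out dozens of orders of magnitude
below anything a realistic number of trials could reach, which is
why a further factor is needed.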

In order to solve the problem we need one other factor. What is
the shortest sequence which performs the function? The shortest
I can think of is simply "eggs". But this is not a full sentence
and would be too brusque for your bride. So let's say the
shortest sentence is "I eat eggs"; without the spaces this is an
8 letter sequence.

What I noticed was that with a 2 unit long sequence, i.e., in a
2-d phase space, the sequence ab occurs at only one point out of
the 26 x 26 points in a 26 character set. That is 1/676=.0015. If
you embed this 2-d space in a 3-d space (i.e., use a 3 unit long
sequence), there are then 52 sequences containing ab: 26
sequences *ab and 26 sequences ab*. [The asterisk is a wild-card
standing in for any letter.]
Thus the odds of finding a sequence with ab are 52/17576=.0030, a
considerable improvement in the odds of finding ab. Embedding the
2-d sequence in a 4-d space requires that **ab, *ab* or ab** be the
sequence found.
There are 3 x 26^2 such sequences in the 4-d space and thus the
odds are .0044 of finding an ab. Each subsequent embedding raises
the odds of finding a particular short sequence.
It would appear that the equation ought to look something like:
It would appear that the equation ought to look something like:

prob = (N-n+1)(L^(N-n))/(L^N)

where N is the number of dimensions in the larger phase space, n
is the number of dimensions in the smaller phase space and L is
the number of characters which can be selected. This equation
ignores those sequences which have multiple copies of the desired
embedded sequence, but they are a small quantity by comparison
and can be safely ignored.
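
The estimate can be compared against an exhaustive count for the
small "ab" cases. The Python sketch below is an illustration only;
the function names are made up for the example. It enumerates every
string of length 2, 3 and 4 over a 26 letter alphabet, so it is
practical only for short lengths.

```python
from itertools import product

def approx_embed_probability(N, n, L):
    """The note's estimate: (N - n + 1) * L^(N - n) / L^N."""
    return (N - n + 1) * L ** (N - n) / L ** N

def exact_embed_probability(target, N, alphabet):
    """Brute force: fraction of all length-N strings that contain target."""
    hits = sum(target in "".join(s) for s in product(alphabet, repeat=N))
    return hits / len(alphabet) ** N

letters = "abcdefghijklmnopqrstuvwxyz"
for N in (2, 3, 4):
    approx = approx_embed_probability(N, 2, 26)
    exact = exact_embed_probability("ab", N, letters)
    print(f"N={N}: estimate {approx:.6f}, exhaustive count {exact:.6f}")
```

For N=4 the two values differ only because the estimate counts the
single string "abab" twice, which is the kind of multiple-copy case
the equation ignores.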

Thus the search of a 70-d space for an 8-unit sequence ("I eat
eggs") should yield

prob = (70-8+1)(20^62)/(20^70) = 63/20^8 = 2.46 x 10^-9

This is the probability that you will randomly make a 70 unit
long sequence which contains the string "ieateggs" somewhere in
it. But one can object that this embedding of the wanted string
in another one makes it unlikely to be useful. After all, the
string

"fieuoindhgeosyhdbflgdsyfgshsdfgdfosuieateggsqcrpflacyebfmcpdusmwgcnmle"

does not seem to convey much information. But, as is often noted
in discussions of the origin of protein or DNA sequences, once
formed the sequence is likely to be cut randomly. So what are
the odds that a sequence with "ieateggs" will be cut twice, at
just the correct location? If we consider that a sequence that
is not cut is equivalent to cutting it past the terminal
character of the sequence, there are 71 places you can cut the
sequence. Thus for the above sequence, randomly cut, there is a
1/(71*71) = 1/5041 chance of cutting it in such a fashion that the
"ieateggs" statement is extracted. Thus the total probability
of finding a useful sequence in the 70 unit long sequence is
(63/20^8)/5041 = 4.88 x 10^-13.
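
The two factors can be multiplied out exactly. The Python sketch
below is an illustration with invented variable names. It takes the
cutting model at face value as described above: two independent
cuts, each equally likely to land at any of the 71 positions.
Carried through without rounding the intermediate value, the
product comes out near 4.9 x 10^-13.

```python
from fractions import Fraction

ALPHABET = 20   # letters allowed
LONG = 70       # length of the random sequence
SHORT = 8       # length of "ieateggs"

# Chance that the short string appears somewhere in the long one,
# using the approximation (N-n+1) * L^(N-n) / L^N.
embed = Fraction((LONG - SHORT + 1) * ALPHABET ** (LONG - SHORT),
                 ALPHABET ** LONG)

# Two independent cuts, each at one of N+1 = 71 equally likely places.
cut = Fraction(1, (LONG + 1) ** 2)

total = embed * cut
print(f"embedded somewhere: {float(embed):.3e}")
print(f"cut out correctly:  {float(cut):.3e}")
print(f"total:              {float(total):.3e}")
```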

How likely are we to find this useful sequence? If we were to
assign amino acids to the letters, write this sequence in
proteins, and then create a vat with 10^14 70-amino acid
proteins (this is an average sized vat produced in university
laboratories today), we would expect to find roughly 50 copies of
the "ieateggs" sequence in the first vat.

This is not all. The next shortest useful sequence to answer
your bride's question is "I want eggs". This is a nine character
sequence. The odds of finding and cutting out this sequence
in a 70-unit long sequence are 2.40 x 10^-14. In your first
vat of proteins there is a high probability that at least one
"iwanteggs" will be found. But there is also the phrase "I like
eggs", which is also 9 characters long and has a probability of
2.40 x 10^-14 of being in the vat after each sequence is cut
twice. There are also "I need eggs", "I wish eggs" and "I have
eggs".

If we look for 10-character solutions, we have "I covet eggs",
"I crave eggs", "I fancy eggs", "I favor eggs". Each of these has
a probability of approximately 10^-15. You would be likely to find
one of these in the first 10 vats.

In addition to these, if we go to an 11-character solution, we have
phrases like "I ingest eggs", "I devour eggs", "I gobble eggs".
Each of these has a likelihood of about 6 x 10^-17.
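
The figures for the different answer lengths can be tabulated
together. The Python sketch below is illustrative only, and the
function name is made up for the example. It combines the embedding
estimate with the 1/71^2 cutting factor for answers of 8 to 11
letters and prints the expected number of copies in one vat of
10^14 sequences.

```python
VAT_SIZE = 10 ** 14   # sequences per vat, as in the note
ALPHABET = 20         # letters allowed
LONG = 70             # length of each random sequence

def useful_probability(short_len):
    """Embedding estimate times the 1/71^2 chance of two correct cuts."""
    embed = (LONG - short_len + 1) / ALPHABET ** short_len
    cut = 1 / (LONG + 1) ** 2
    return embed * cut

for short_len in (8, 9, 10, 11):
    p = useful_probability(short_len)
    print(f"{short_len:2d} letters: p = {p:.2e}, "
          f"expected per vat = {p * VAT_SIZE:.3g}")
```

An expected count below one per vat means that, on average, several
vats are needed before that particular phrase turns up.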

This can go on and on. Within the 70-d space there are hundreds
of thousands of ways of saying that you want eggs for breakfast.

One question which can be addressed here is how a short
useable sequence can become longer. Well, if you come down to
breakfast and say brusquely to your bride, "I eat eggs", she
might cook them for a few days, but eventually she will demand a
politer response, like "Dear, I eat eggs". Small additions from
one useable form to another, due to selection pressure caused by
your hunger pangs when your bride doesn't fix your breakfast, can
eventually lead you to say, "My beautiful wife, I am most
desirous of eating two eggs this morning." Obviously this
sequence has a greater functionality than simply "I eat eggs".
But this greater functionality is what we observe today, and it is
what the probability argument expects to be produced on the first
attempt.

Do proteins act in the same fashion as the language above? Yes.
Gerald Joyce is one of the leaders in the field of directed
evolution. He noted that about 1 in a million of his sequences is
capable of performing the function he was looking for. This is a
far cry from the 1 chance in 10^200 normally cited by
antievolutionists.
I would point you to Discover, May 1994, "Speeding
Through Evolution," and to Gerald E. Joyce, "Directed
Evolution," Scientific American, Dec. 1992, somewhere around
pp. 94-95, or Beaudry and Joyce, Science, 257:637-638, 1992.

Sean Eddy of the Washington University School of Medicine
recently wrote on Talk Origins (message
<EDDY.95Aug17084136@wol.wustl.edu>) that RNA sequence space is
teeming with interesting functionalities, all based upon Joyce's
work.

Thus, the weakness in the traditional creationist probability
argument is twofold. First, it assumes that one and only one
sequence can perform a given function. Second, it assumes that
the most complex forms must be made at first. This ignores the
potential of short sequences performing the same function.

When one adds this weakness to the other weaknesses mentioned
over the past few weeks, the weakness of our apologetical approach
becomes obvious. The problems are: 1) the amount of genetic
variability in humans which requires an ancient humanity in order
to fit the Biblical data. 2) The inability of young-earth
creationists to account within their time frame for how the caves
could be formed in which fossil man lived. 3) The fact that
fossil man apparently built religious altars of various forms
which is unaccounted for by those defending a recent origin of
Adam. 4) The inability of old earth creationists to point to a
place and a set of rocks to explain how the flood occurred and
how it matches the Biblical account (how could Noah float for a
year and land anywhere near mountains?). 5) Whether one accepts
the fossils we discussed in June and July as truly transitional
or not is less important to the apologetical case than what
those fossils look like. If they have the appearance of being
transitional forms, all our pleading that these are really NOT
transitional forms will fall on deaf ears.

The young earth creationists position Christianity in opposition
to almost every piece of observational data science collects,
from astronomy, biology, geology, paleontology, physics and
anthropology. The PC and TE positions, with a recent creation of
man, are much better, but they place Christianity in opposition
to certain biomolecular data (MHC and other allelic diversity) and
anthropological data (the nature of fossil man) as noted above.

It is very obvious that the positions we are defending
apologetically are not very secure.

The question those interested in Christian apologetics and the
relation between science and the early chapters of Genesis should
ask themselves is whether the purpose of the Christian apologist
is to explain the observational data in a Biblical framework or
to explain the data away. These are two very different
approaches. But if the probability argument against evolution is
as weak as I showed above, Christianity had best find a better
way to handle the area of Science and the Bible.

glenn
16075 Longvista Dr.
Dallas, Texas 75248