Random origin of biological information

From: pruest@pop.dplanet.ch
Date: Fri Sep 22 2000 - 07:51:34 EDT

    Glenn Morton wrote (in part):

    >Date: Wed, 20 Sep 2000 14:06:26 -0500 (CDT)
    >From: mortongr@flash.net
    >Subject: Random chance brings meaning
    >...
    >We are going to test these ideas, that random sequences can't create
    >information. And if genes are like words and sentences as Kenyon and Davis
    >claim, then I will show that random sequences CAN create information.

    Glenn presented the Vigenère cipher, an encryption-decryption method, to
    demonstrate that random processes "CAN create information". He shows
    that a 21-letter message encoded by a random 21-letter key can be
    "decoded" by 9 other random 21-letter keys to yield 9 different
    meaningful messages.

    In fact, it is quite easy to obtain such solutions: select a meaningful
    21-letter phrase, take its first (second, ...) letter, locate it in the
    first row of the Vigenère table, go down this column to the first
    (second, ...) letter of the coded message, and read off the letter that
    heads this row: this is the first (second, ...) letter of the new
    "random" key. Repeat this for all 21 letters.

    For instance, take the message 'RandomOriginOfEnzymes': the procedure
    yields the key 'yeslsxvafodpqduiwwqtt'. Apply this "random" key to
    decipher Glenn's original coded message 'pefogjjrnulceiyvvucxl' and
    you'll obtain 'RandomOriginOfEnzymes'!

    But of course, that's cheating, because we worked backwards!
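
    The backward construction can be sketched in a few lines of Python.
    This is only an illustration, assuming the standard Vigenère table
    (A = shift 0) and ignoring letter case; the function names are made up
    for this sketch.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed 26-letter alphabet, case ignored


def derive_key(ciphertext: str, wanted_plaintext: str) -> str:
    """Work backwards: find the key that 'decodes' ciphertext into the wanted phrase."""
    c, p = ciphertext.lower(), wanted_plaintext.lower()
    if len(c) != len(p):
        raise ValueError("ciphertext and wanted phrase must have the same length")
    # key letter = cipher letter - plaintext letter (mod 26)
    return "".join(
        ALPHABET[(ALPHABET.index(x) - ALPHABET.index(y)) % 26] for x, y in zip(c, p)
    )


def decipher(ciphertext: str, key: str) -> str:
    """Ordinary Vigenère decryption: plaintext letter = cipher letter - key letter (mod 26)."""
    return "".join(
        ALPHABET[(ALPHABET.index(x) - ALPHABET.index(k)) % 26]
        for x, k in zip(ciphertext.lower(), key.lower())
    )


coded = "pefogjjrnulceiyvvucxl"      # Glenn's coded message, as quoted above
phrase = "RandomOriginOfEnzymes"     # the phrase we want to "find"
key = derive_key(coded, phrase)
print(key)                           # yeslsxvafodpqduiwwqtt
print(decipher(coded, key))          # randomoriginofenzymes
```

    Any 21-letter phrase whatsoever can be "found" in this way, which is
    exactly why the result says nothing about random keys.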

    There are 26^21, or about 5.2 x 10^29 (that's 520,000 trillion
    trillion), different 21-letter strings of 26 possible letters. How many
    meaningful phrases of 21 letters might there be? 1000? a million? a
    trillion? I don't know. I haven't written a computer program to try to
    get an estimate. The "natural selection" routine required for this
    program must be quite involved, including a parser, a dictionary, some
    expert system algorithms, as well as a user-friendly interface for a
    human to evaluate the tentative solutions proposed by the program. But
    maybe Glenn, who certainly did not cheat, can provide us with such an
    estimate. What's your hit rate, Glenn?
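
    To put the three guesses above into numbers, here is a small sketch. It
    assumes only what is stated above (26 letters, 21 positions, no spaces
    or case); the counts of "meaningful" phrases are the guesses from the
    text, not measured values.

```python
TOTAL = 26 ** 21  # number of different 21-letter strings over 26 letters


def hit_probability(meaningful_phrases: int) -> float:
    """Chance that one random key decodes to some meaningful phrase,
    if there were `meaningful_phrases` of them among all 21-letter strings."""
    return meaningful_phrases / TOTAL


print(f"total strings: {TOTAL:.1e}")
for guess in (10**3, 10**6, 10**12):  # 1000, a million, a trillion (guesses only)
    print(f"{guess:.0e} meaningful phrases -> hit chance per key {hit_probability(guess):.1e}")
```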

    Manfred Eigen, Nobelist and inventor of the hypercycles, also cheated by
    working backwards. In popular lectures about the origin of life, he used
    to present a computer simulation purporting to show that information
    can indeed emerge quite rapidly by means of random "evolutionary"
    processes. He generated a random sequence of letters, which he mutated
    randomly. Each time a letter happened to equal the corresponding letter
    of a meaningful phrase deposited beforehand, it was fixed and remained
    so. Of course, the process produced the "information" that had been
    supplied after not too many generations!
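
    A simulation of this kind might look like the sketch below. I don't
    have the details of Eigen's program, so the mutation scheme here (every
    letter not yet fixed is redrawn at random in each generation) is an
    assumption, as are the function name and the target phrase.

```python
import random
import string


def eigen_style_run(target: str, seed: int = 0) -> int:
    """Count generations until a random string matches a pre-deposited target,
    when every letter that matches the target is frozen for good."""
    rng = random.Random(seed)
    letters = string.ascii_lowercase
    target = target.lower()
    current = [rng.choice(letters) for _ in target]  # random starting sequence
    generations = 0
    while any(c != t for c, t in zip(current, target)):
        generations += 1
        for i, t in enumerate(target):
            if current[i] != t:                   # matched letters stay fixed
                current[i] = rng.choice(letters)  # unmatched letters mutate
    return generations


gens = eigen_style_run("RandomOriginOfEnzymes", seed=1)
print(f"target reached after {gens} generations")
```

    The target phrase sits inside the selection rule from the start, so the
    run only recovers what was put in.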

    But let's look more closely at what really happens in evolution! Hubert
    P. Yockey ("A calculation of the probability of spontaneous biogenesis
    by information theory", J. Theor. Biol. 67 (1977), 377) compared the
    then-known sequences of the small enzyme cytochrome c from different
    organisms. He found that 27 of the 101 amino acid positions were
    completely invariant, that 2 different amino acids occurred at 14
    positions, 3 at 21, etc., and that no position had more than 10.
    Optimistically assuming that the 101 positions are mutually independent
    and that chemically similar amino acids can replace each other at the
    variable positions without harming the enzymatic activity, he
    calculated that 4 x 10^61 different sequences of 101 amino acids might
    have cytochrome c activity. But this implies that the probability of
    spontaneous emergence of any one of them is only 2 x 10^(-65), which is
    way too low to be considered reasonable (it is unlikely that these
    numbers would change appreciably by including all sequences known
    today). A similar situation applies to other enzymes, such as
    ribonucleases.

    Thus, a modern enzyme activity is extremely unlikely to be found by a
    random-walk mutational process. But "primitive" enzymes, near the origin
    of life, may be expected to have much less activity and to be much less
    sensitive to variation. Unfortunately, until someone synthesizes a set
    of "primitive" cytochromes c, we have no way of knowing the effects of
    these factors.

    What we can do, however, is to estimate how many invariant sites can be
    expected to be correctly occupied by means of a random walk before a new
    enzyme activity becomes selectable by Darwinian evolution (of course,
    such an invariant set may be distributed among more sites which are
    correspondingly more variable, without affecting the conclusions). So,
    let's start with some extremely optimistic assumptions (cf. P. Rüst,
    "How has life and its diversity been produced?" PSCF 44 (1992), 80):

    Let's assume that all of the Earth's biomass consists of the most
    efficient biosynthesis "machines" known, bacteria, and that all of them
    continually churn out test sequences for a new enzyme function which
    doesn't yet exist in any organism. They start with random sequences or
    sequences having a different function. Natural selection starts only
    after a minimal enzymatic activity of the type wanted is discernible.
    In today's biosphere, t = 10^16 moles of carbon are turned over yearly,
    there are n = 10^14 bacteria per mole of carbon, and a bacterium is
    taken to have b = 4.7 x 10^6 base pairs in its DNA. This yields
    R = tnb = 4.7 x 10^36 nucleotide replications per year on Earth.

    In protein biosynthesis, there are c = 61/20 = 3.05 codons per amino
    acid, a = 2.16 mutations per amino acid replacement (geometric average
    of all possible shortest mutational walks in the modern code table),
    and a mutation rate of 1 mutation in m = 10^8 nucleotides replicated.
    Therefore, r = 1/(c(3/m)^a) = 5.8 x 10^15 nucleotide replications are
    required for 1 specific amino acid replacement (the factor 3 represents
    the codon length in the triplet code).

    In order to get s specific amino acid replacements, r^s nucleotide
    replications are needed, and the average waiting period for 1 hit
    anywhere on Earth is W = (r^s)/R years. For s = 1, W = 4 x 10^(-14)
    seconds; for s = 2, W = 4 minutes; for s = 3, W = 40 billion years!
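
    The arithmetic can be checked with a short script. It uses only the
    values of t, n, b, c, a and m given above; the length of a year in
    seconds is my own added constant, and the function name is arbitrary.

```python
SECONDS_PER_YEAR = 3.156e7  # added constant (about 365.25 days), not from the text

t = 1e16    # moles of carbon turned over per year
n = 1e14    # bacteria per mole of carbon
b = 4.7e6   # base pairs per bacterial genome
R = t * n * b                 # nucleotide replications per year on Earth

c = 61 / 20  # codons per amino acid
a = 2.16     # mutations per amino acid replacement
m = 1e8      # nucleotides replicated per mutation
r = 1 / (c * (3 / m) ** a)    # replications per specific amino acid replacement


def waiting_time_years(s: int) -> float:
    """Average waiting period W = r^s / R, in years, for s specific replacements."""
    return r ** s / R


print(f"R = {R:.1e} replications per year, r = {r:.1e}")
print(f"s = 1: {waiting_time_years(1) * SECONDS_PER_YEAR:.0e} seconds")
print(f"s = 2: {waiting_time_years(2) * SECONDS_PER_YEAR / 60:.1f} minutes")
print(f"s = 3: {waiting_time_years(3):.1e} years")
```

    Each additional required replacement multiplies the waiting period by
    r, about 6 x 10^15, which is why the jump from s = 2 to s = 3 is so
    large.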

    Thus the minimal set for a starting enzymatic activity cannot contain
    more than 2 specific amino acid occupations! Of course, for the origin
    of life, biosynthesis "machines" like bacteria were not yet available,
    and certainly not in an amount equalling today's biomass! Does it still
    sound reasonable to assume that biological information is easily
    generated by random processes? Or is there something wrong with the
    model underlying the above estimate?

    If God used only random processes and natural selection when He created
    life 3.8 billion years ago, we should be able to successfully simulate
    it in a computer. You may even cheat: the genome sequences of various
    non-parasitic bacteria and archaea are available. The challenge stands.
    By grace alone we proceed, to quote Wayne.

    Peter Rüst


