Lateral gene transfer

R. Joel Duff (Duff@siu.edu)
Wed, 14 Aug 1996 16:23:14 -0500

>Question for Joel Duff (or anyone else on the list):
>
>What experimental -- as opposed to post hoc inferential -- evidence
>is there for lateral gene transfer? I'm pretty familiar with the
>literature on phylogenetic claims, i.e., sequence X is sitting where it
>looks rather odd: postulate a lateral transfer. But what "real time"
>(if you will) studies exist showing the acquisition of foreign genes by
>a population or species?
>
>Here's why this matters. I'll cite a nicely provocative paper by
>Schwabe and Warr:
>
> We believe that it is possible to draw up a list of basic
> rules that underline existing molecular evolutionary models:
>
> 1. All theories are monophyletic, meaning that they all
> start with the *Urgene* and the *Urzelle* which have given
> rise to all proteins and all species, respectively.
>
> 2. Complexity evolves mainly through duplications and
> mutations in structural and control genes.
>
> 3. Genes can mutate or remain stable, migrate laterally
> from species to species, spread through a population by
> mechanisms whose operation is not fully understood, evolve
> coordinately, splice, stay silent, and exist as pseudogenes.
>
> 4. Ad hoc arguments can be invented (such as insect vectors
> or viruses) that can transport a gene into places where no
> monophyletic logic could otherwise explain its presence.
>
> This liberal spread of rules, each of which can be observed
> in use by scientists, does not just sound facetious but
> also, in our opinion, robs monophyletic molecular evolution
> of its vulnerability to disproof, and thereby of its
> entitlement to the status of a scientific theory.

Assumption #1 and this last statement are where the difficulty lies. The
claim that "all theories are monophyletic" is a statement of practicality
with respect to phylogenetic analyses. Yes, computer programs such as PAUP
have the assumption of monophyly written into their code, but this does
not preclude paraphyletic groups, and other situations that may arise
through horizontal transfer, from actually representing reality. Such
programs have been treated as a black box: sequence data goes in and your
phylogeny pops out the other side. But these programs' abilities to
estimate phylogenies are only as good as the assumptions and algorithms
written into them. THERE is NO WAY at present to deal with (predict)
paraphyletic relationships in these algorithms. In addition, the concept
of parsimony, which looks for the simplest explanation of the data at
hand, is prevalent in these analyses. One should always be aware, when
doing these analyses, of the assumptions in the algorithms.
Therefore the presence of "strange" data is often a sign that one of the
assumptions of the simplistic algorithm is not holding up. It doesn't
necessarily invalidate the entire process but may call into question one
portion of the algorithm. For example, in a data set I may be able to
hypothesize a specific set of relationships between taxa A, B, C, and D
which requires 15 mutations, while the same data could support a different
set of relationships which would require 17 mutations. Every algorithm I
know of is going to opt for the explanation requiring 15 mutations as the
"best estimate." If in reality this is not the case, I don't say that
something is terribly wrong because my analysis has given me the wrong
answer. Instead I recognize that parsimony was a large part of my analysis,
and that many times the simplest means of reaching a particular point
in evolution isn't the way it was done in reality. Look at all the
examples of biochemical pathways which are not as efficient as they could
be.
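
To make the parsimony point concrete, here is a minimal sketch of how a
program might rank the three possible groupings of four taxa by counting
mutations. It uses Fitch's counting rule on a tiny made-up alignment; the
sequences, the function names (fitch_site, parsimony_score) and the tree
encoding are all assumptions for illustration, not PAUP's actual code or
real data.

```python
def fitch_site(tree, states):
    """Fitch pass for one site: return (possible ancestral states, changes).

    `tree` is a taxon name (leaf) or a (left, right) tuple (internal node).
    `states` maps each taxon name to its character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra mutation needed.
        return shared, left_cost + right_cost
    # Children disagree: one mutation must be placed on this branch pair.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Total minimum number of mutations the tree needs over all sites."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        site = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, site)[1]
    return total


# Hypothetical toy alignment (invented for this example).
ALIGNMENT = {
    "A": "AAGTC",
    "B": "AAGTT",
    "C": "GCGAC",
    "D": "GCTAT",
}

# The three possible groupings of four taxa, written as nested pairs.
TOPOLOGIES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
    "((A,D),(B,C))": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_score(tree, ALIGNMENT)
          for name, tree in TOPOLOGIES.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score} mutations")
print("parsimony picks:", best)
```

On this toy data the grouping ((A,B),(C,D)) needs 6 mutations against 8
and 9 for the other two, so it is reported as the "best estimate." Nothing
in the calculation asks whether the true history actually took the
shortest route; that is an assumption built into the method.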

So while it is true in some sense that all theories are monophyletic, in
many cases this is out of necessity for any understanding to be achieved.
The algorithms we use today to "estimate" phylogeny are only as good as our
understanding of evolutionary processes. I admit this may appear to buffer
phylogenetic studies from vulnerability to disproof. Progress results from
having simple models and recognizing that when our expected result is not
met we may have to change our models. I'm not unaware of the circularity
that this may seem to present, but ultimately a point is reached where one
cannot tweak the model any more and it will crumble altogether. Witness
the efforts over time to fit astronomical data into a geocentric view.

Joel

>
>(C. Schwabe and G. Warr, _Perspectives in Biology and Medicine_ 27
>[1984]: 465-485)
>
>Actually, Schwabe and Warr tell only half the story, and not even
>the scariest half.
>
>It is standard practice in molecular phylogeny construction to determine
>sequence alignment, the first analytical stage after the raw sequence
>data are obtained, more or less "by eye" -- and then to assess the
>reliability of that alignment by checking the phylogeny it produces. If
>you get a wild phylogeny: well, must be the wrong alignment.
>
>[A recent example of this sort of method, which should make anyone's hair
>stand on end, is Christopher Wills's paper "Topiary Pruning and Weighting
>Reinforce An African Origin for the Human Mitochondrial DNA Tree,"
>_Evolution_ 50 (1996): 977-989. Wills "prunes" (i.e., manipulates) human
>mtDNA sequence data iteratively until he gets the phylogenetic pattern he
>wants, because the starting pattern shows a starburst phylogeny with the
>primate outgroup located "very asymmetrically" from the root of the human
>starburst. Basically Wills tidies up the data until he gets the phylogeny
>he likes.]