Re: phylogenetic reconstruction

R. Joel Duff (joelduff@nls.net)
Tue, 7 Dec 1999 08:36:03 -0400

>Teichmann, SA and Mitchison, G. "Is there a phylogenetic signal in
>prokaryote proteins?" J. Molec. Evol. 49:98-107
>
>Abstract
>
>Using the sequence information from nine completely sequence bacterial
>genomes, we extract 32 protein families that are thought to contain
>orthologous proteins from each genome. The alignments of these 32 families
>are used to construct a phylogeny with the neighbor-joining algorithm. This
>tree has several topological features that are different from the
>conventional phylogeny, yet it is highly reliable according to its bootstrap
>values. Upon closer study of the individual families used, it is clear that
>the strong phylogenetic signal comes from three families, at least two of
>which are good candidates for horizontal transfer. The tree from the
>remaining 29 families consists almost entirely of noise at the level of
>bacterial phylum divisions, indicating that even with large amounts of data,
>it MAY NOT BE POSSIBLE TO RECONSTRUCT THE PROKARYOTE PHYLOGENY USING
>STANDARD SEQUENCE BASED METHODS. (Emphasis mine)
>
>If it won't work in bacteria, what hope is there for higher organisms??? I
>predict that the signals will become less and less clear, the more we know.

Art,
I don't find this result surprising really. I don't have this issue of
JME on hand but I will make several observations from the abstract.

1) Notice that the phylogenetic signal appears to come from gene families
that may be examples of horizontal transfer. Such genes would not only
produce a false topology but would also tend to give the best-supported
one, because the transfer wouldn't be as old as the taxa and thus the
sequence divergence in these genes wouldn't be as great. The reason rDNA
genes (16S and 23S) have typically been used in these studies (and this is
probably what they are comparing their results to) is that they are the
only genes thought not to be susceptible to horizontal transfer. Also, they
are typically much more conserved than protein coding genes.

2) "thought to contain orthologous proteins" Determining homology
(orthology versus paralogy) is a big problem with these protein coding gene
families. This is the reason mtDNA and cpDNA genes are so widely used in
animal and plant phylogenies rather than nuclear DNAs. Unfortunately, no
such organellar genomes exist for bacteria.

3) There are very few protein coding genes that are conservative enough to
have any resolving power at the level of divergent bacterial groups. Too
much divergence and the phylogenetic signal gets swamped by homoplasious
changes. In this case many of the genes were certainly too variable and
thus led to huge amounts of homoplasy. Why then did they get strong
bootstrap support for their tree? I would suggest two reasons: (a) most of
the genes they used had high homoplasy levels, which ended up negating
their signal and leaving the couple of genes with strong signal to dominate
the analysis; (b) the small number of samples and, more importantly, the
divergence of these samples. As they increase the number of samples the
support for each clade will likely decrease.
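
To put a rough number on the saturation argument in point 3, here is a
minimal Python sketch. It assumes the Jukes-Cantor nucleotide model, which
is my choice for illustration only (the abstract does not say which model
was used, and the study itself used proteins), and the function names are
made up. Under that model the expected proportion of differing sites levels
off near 0.75 as the true number of substitutions per site grows, so very
divergent sequences carry almost no information about how divergent they
really are.

```python
import math


def jc_expected_p(d):
    """Expected proportion of differing sites after d substitutions
    per site, under the Jukes-Cantor model (assumed for illustration)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_distance(p):
    """Jukes-Cantor corrected distance from an observed proportion of
    differing sites p. Only defined for 0 <= p < 0.75."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # As d grows, the observable difference p flattens out toward 0.75.
    for d in (0.05, 0.25, 1.0, 2.0, 4.0):
        print(f"d = {d:4.2f}   expected p = {jc_expected_p(d):.3f}")
```

Doubling the true divergence from 2 to 4 substitutions per site changes the
expected observable difference by only a few percent, which is the sense in
which the signal gets swamped.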

Hope for higher organisms? Certainly. Bacterial phylogenies will likely
always be extremely difficult to reconstruct. But large, combined data
sets of genes with the appropriate levels of signal are being shown to be
very useful in many eukaryote groups. My own study (just deemed not
interesting enough to be accepted into Science :-( ) includes sequences of
four genes from three subcellular organelles (18S - nucleus; 19S -
mitochondrion; 16S and rbcL - plastid) for representatives of the major
land plant lineages. Combining the data results in a highly resolved
phylogeny of land plants that could not be achieved with a single gene.
Furthermore, the generated phylogeny is more intuitive than any of the
single gene analyses, each of which suffers from pathological features of
the particular sequence (i.e. long branch attractions, uneven patterns of
RNA editing, transition/transversion biases, etc.).

So I would agree that it may not be possible to reconstruct the prokaryote
phylogeny using standard sequence based methods, but I would not go so far
as to suggest that this spells doom for other studies of higher organisms.

Just a few observations.
Joel

-------------------------------------
R. Joel Duff, Assistant Professor
Dept. of Biology, ASEC 185
Campus Mail 3908
University of Akron
Akron OH, 44325-3908
Office: 330-972-6077
rjduff@uakron.edu
-------------------------------------