Possible Role of Protein Modules in a Theory of Theistic Evolution

Science in Christian Perspective

Gordon C. Mills*

Department of Human Biological Chemistry & Genetics
University of Texas Medical Branch

From: PSCF 50 (June 1998): 136-139.

In my previous proposal of a theory of theistic evolution, I discussed briefly the question of protein families. At that time I noted: "… groups of similar proteins, often with similar functions, share certain structural and sequence similarities, although some portions of the molecules may be quite different."¹ I also noted that I would not include each protein in these family groups as new genetic information. In the present paper, I wish to evaluate more recent studies on protein families and the similarities noted in portions of these protein molecules. In a great many protein families, the similarity is a consequence of having a particular modular group. It has been proposed that new functions of protein molecules may arise by transfer of gene segments in the DNA coding for these protein molecules.² These gene segments are expressed in proteins as modules, polypeptide units containing, in most cases, 80–250 amino acids. Bork and Bairoch define protein domains and modules as follows:

The term protein domain is often used to describe a spatially distinct structural unit that has characteristic features, but does not have to be contiguous in sequence … Protein modules can be thought of as a distinct subset of protein domains … modules are contiguous in sequence, and are repeatedly used as "building blocks" in functionally diverse proteins.³

Bork and Bairoch also note that:

… the most propagated genetic spreading mechanism is believed to be "exon shuffling" … It assumes that modules are encoded by exons that are flanked by introns. If such exons are "shuffled," the introns function as buffers, preventing gene destruction. This requires phase compatibility of the flanking introns and those of the receiver gene.⁴

This theory of exon shuffling has limitations, however, since bacterial genes, which do not have introns, also appear to contain some modules in their protein molecules. However, bacterial genomes could have other types of recognition sites that would permit the "shuffling" of modules.
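The phase-compatibility requirement in the passage quoted from Bork and Bairoch can be made concrete with a small sketch. It relies on the usual convention, which is an assumption added here and not spelled out in the quoted text, that an intron has phase 0 when it lies between two codons, phase 1 when it follows the first nucleotide of a codon, and phase 2 when it follows the second. The function names are hypothetical and chosen only for illustration.

```python
def intron_phase(coding_nucleotides_before: int) -> int:
    """Phase of an intron, given how many coding nucleotides precede it.

    Phase 0: intron falls between codons.
    Phase 1: intron falls after the first nucleotide of a codon.
    Phase 2: intron falls after the second nucleotide of a codon.
    """
    if coding_nucleotides_before < 0:
        raise ValueError("nucleotide count must be non-negative")
    return coding_nucleotides_before % 3


def preserves_reading_frame(module_5p_phase: int,
                            module_3p_phase: int,
                            recipient_phase: int) -> bool:
    """True if a module exon can be placed into a recipient intron
    without shifting the reading frame of the receiver gene.

    Assumption for this sketch: the exon is moved together with parts of
    its flanking introns, so both of its flanking intron phases must
    equal the phase of the recipient intron.
    """
    for phase in (module_5p_phase, module_3p_phase, recipient_phase):
        if phase not in (0, 1, 2):
            raise ValueError("intron phase must be 0, 1, or 2")
    return module_5p_phase == module_3p_phase == recipient_phase
```

Under these assumptions, a module flanked by two phase-1 introns fits a phase-1 intron of the receiver gene but not a phase-0 intron, which is the sense in which the introns act as "buffers" only when their phases match.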

The evidence for this concept of module transfer comes from the finding that there are many diverse proteins with portions that are quite similar in amino acid sequence. In many cases, there is no significant amino acid similarity in remaining portions of the protein molecules. It is clear that the extent of amino acid similarity varies quite markedly in these protein modules, ranging from as low as 25% similarity in some comparisons to 80 or 90% in others. Nevertheless, even similarities of 25% cannot be explained as due to purely chance arrangements. With 20 different amino acids, chance arrangements would be expected to give similarity values of ca. 5%. It should be noted that reported amino acid similarity values are often maximized in computer matching by either insertion or deletion of one or more amino acids. The examples described below illustrate the different types of experimental findings that have led to the concept of "modular building blocks" in protein molecules.
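A minimal sketch of how such similarity values are computed may be helpful. It assumes that "similarity" means the percentage of identical residues at aligned positions, that gap positions introduced by insertion or deletion are skipped, and that all 20 amino acids are equally frequent when estimating the chance value. The two aligned fragments are invented for illustration and do not come from any protein discussed here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of compared positions holding the same amino acid.

    Both inputs must already be aligned to equal length, with '-'
    marking a gap. Gap positions are excluded from the comparison.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # position created by an insertion or deletion
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Expected identity for unrelated sequences, assuming 20 equally
# frequent amino acids and independent positions (an idealization).
EXPECTED_CHANCE_IDENTITY = 100.0 / 20

# Hypothetical aligned fragments, for illustration only.
FRAGMENT_A = "MKT-LLVAGG"
FRAGMENT_B = "MRTALL-AGS"
```

Because gaps are placed wherever they raise the score, a reported value depends on the alignment chosen, which is why maximized similarity values should be read with some caution.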

Extracellular protein modules. Protein modules appear to be quite prevalent in mammalian extracellular proteins. In a recent summary, Bork and Bairoch indicate that about 60 different examples fit strict criteria for classification as extracellular protein modules.⁵ The extracellular proteins in which these modules appear have a wide variety of functions in organisms, ranging from the complement cascade for defense against infectious agents to components of the blood clotting system.

Intracellular protein modules. Ponting and Phillips scanned databases looking for a particular module, 80–90 amino acid residues long, called DHR. They identified the DHR module in 27 different proteins. Some of these proteins were involved in signal transduction at synaptic junctions. Others had a catalytic site, functioning as protein kinases, guanylate kinases, protein tyrosine phosphatases or neuronal nitric oxide synthases. All of the above catalytic proteins are involved in cell signaling.⁶

An interesting illustration of the use of modules in diverse organisms involves the eukaryotic initiation factor (EIF-2). Phosphorylation of the α-subunit of this factor by an EIF-2α kinase regulates protein synthesis during the process of translation. This regulatory kinase was studied in humans (RNA-dependent, designated PKR), in rabbits (heme regulated, designated HRI), and in yeast (designated GCN2).⁷ Although the greater portions of these three different protein kinase molecules have no amino acid similarity, each does contain two smaller modules in the amino acid sequence that do have similarity in the kinase catalytic domains. Each of these three different kinases has a different molecular size and each has a different regulatory mechanism.

Recent studies have shown the importance of a rapid breakdown of certain proteins by eukaryotic cells. A major pathway of removal involves a 26S (2000 kilodalton) tunnel-like structure which has, as a key catalytic component, a 20S proteasome. This eukaryotic proteasome is a barrel-shaped particle of four stacked seven-membered rings made up of 14 different, but related protein subunits.⁸ In an archaebacterium, Thermoplasma acidophilum, a similar 20S proteasome carries out the same proteolytic function. This latter structure has only two types of subunits in the stacked rings. Studies of Seemüller, et al. have shown structural and amino acid sequence similarity of a β subunit (a threonine protease) in the T. acidophilum proteasome with some subunits of human proteasomes.⁹ However, despite a high degree of three-dimensional structural similarity, the amino acid sequence similarity for the ca. 210 amino acids of the two modules is only 28% (subunit HS-LMP7 of Homo sapiens vs. Ta-beta of T. acidophilum). Nevertheless, the amino acids in conserved positions that are required for catalytic activity are the same in the HS-LMP7 and Ta-beta subunits. Despite the low degree of amino acid similarity, these subunits are considered to be examples of modular structures, presumably arising from an ancestral modular sequence.

Other illustrations of modular transfer are given by Miklos, who has proposed that transfer of modules is one of the primary sources of new genetic information in eukaryotic organisms. He includes examples of modules in developmental genes, which could possibly have a role in morphologic changes in organisms.¹⁰

Significance of the concept of modular transfer. An interesting facet of the concept of transfer of genetic information as modules is the suggestion that gene cleavage sites for these transfers are not random, but would involve some kind of recognition site such as an exon-intron border, for cleavage and for transfer to another gene coding region. The mechanism for recognition would then be similar to that utilized in the specific cleavage of DNA introns in the process of forming messenger RNA. One should note that the concept of transfer of gene segments is limited to linear portions of those segments. Yet, when one speaks of protein domains, one is often thinking of a site on a three-dimensional molecule that might involve amino acids at remote positions on a linear chain. These positions would also be far apart on the corresponding gene as well. Consequently, the idea that one could have a transfer of genetic information for a complex protein domain with noncontiguous amino acids seems implausible at the present time. On the other hand, linear portions of a complex domain might still be transferred. This type of modular transfer could cause a change in the specificity of an enzyme for certain substrates without causing a change in the general nature of the reaction catalyzed by an enzyme.

There appears to be relatively little direct evidence for module transfer between genes; the evidence at present is primarily circumstantial and is based on module similarities as noted above. Whether transposable elements, which are involved in the movement of genes within cellular genomes, might be utilized for gene segment transfer is not clear at present. If future investigations provide additional support for the concept of providing new enzymatic activities by the transfer of gene segments (modules), can this concept be incorporated into my theory of theistic evolution?¹¹ A partial answer to this appears to reside in the apparent requirement for specific cleavage and reattachment of the segment to be transferred. One may postulate that some source of intelligence (an intelligent cause) would be necessary at some level to provide the required specificity of transfer. This gene segment transfer would involve both DNA cleavage (endonucleases) and DNA reattachment (ligases) or possibly nucleotidyl transferases, with each enzyme having a high degree of specificity. Whether this activity and this specificity might be achieved by protein molecules (enzymes) or by highly specific RNA molecules (ribozymes), or both, is certainly not clear at present. It appears that these gene segment transfers cannot be explained as events of pure chance, such as those involved in usual mutations, since they appear to require specific cleavage and joining sites. There is no evidence to suggest whether gene segment transfer might occur during the process of cell division or whether it might occur when DNA strands of the cell are separated during processes of transcription or repair. Since these types of module transfer would be expected to occur only rarely, they may not prove to be demonstrable by direct experimentation. Certainly there is the possibility that genetic information controlling gene segment transfer might be present in the genome of cells and only rarely be expressed. It could remain dormant (repressed) for many years, with subsequent expression possibly, but not necessarily, being triggered by chance events. Possible triggering events might include highly stressful situations, such as starvation or major environmental change, which are believed to have occurred during several major extinctions.

A key point in my proposed theory of theistic evolution was the need to distinguish carefully between transfer of genetic information and introduction of new genetic information.¹² When one considers the concept of transfer of gene segments (modules) from one gene to another, this would appear to fall clearly in the category of transfer of genetic information. However, in some cases there appears to be a new functional capability in protein molecules as a consequence of a transferred module. Often, this functional capability is due to an increased binding or a unique physical association with some other cell component (protein, membrane, organelle, etc.). In other cases, the new functional capability may be evident as a new capacity for catalyzing enzymatic reactions. Consequently, the concept of modular transfer of gene segments somewhat blurs the distinction I have previously made between transfer of genetic information and the provision of new genetic information.

Difficulties with the concept of modular transfer. Some recent studies illustrate the problems in interpreting proposed modular transfers. Aminoacyl-tRNA synthetases are absolutely essential to all organisms in the translation of genetic information in nucleotide sequences of messenger RNA into amino acid sequences of proteins. These enzymes catalyze the attachment of the twenty different amino acids to either the 3'OH or the 2'OH of the terminal adenosine of a specific transfer RNA (t-RNA). Structural studies by Nurecki, et al. have shown the three-dimensional similarity of some of these t-RNA synthetase molecules as well as their similarity in amino acid sequence. They especially compared modular portions of a glutamate t-RNA synthetase from Thermus thermophilus with a glutamine t-RNA synthetase from Escherichia coli.¹³ Although the two different enzymes have a high degree of structural similarity, the amino acid similarity of modular portions (277 amino acids long) of these two synthetases is only 23%. Did modular portions of these two synthetases arise from some archetypal module? If they did, it would have required the insertion of three short segments of 8, 12, and 14 amino acids each in the T. thermophilus glutamate t-RNA synthetase; also four short segments (7, 2, 12, and 16 amino acids each) would have been inserted in E. coli glutamine t-RNA synthetase. Each of these insertions would have required precise recognition signals at each end for insertion. In addition, there would have to be amino acid changes to account for the 213 amino acid differences in modular portions of the two synthetases. The nonmodular portions of the two synthetases (86 and 235 amino acids, respectively) have no significant similarity. The comparison of these two synthetases provides an indication of how complicated this postulated modular change becomes when one examines the data carefully.
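The figures in this comparison can be checked with simple arithmetic. The sketch below assumes that the 23% value refers to identical residues over the 277 aligned positions and that the count of identical positions is rounded to the nearest whole residue; the variable names are illustrative only.

```python
# Values taken from the synthetase comparison described in the text.
MODULE_LENGTH = 277          # aligned modular portion, in amino acids
PERCENT_SIMILARITY = 23      # reported similarity, in percent

# Postulated insertions, in amino acids per segment.
INSERTIONS_T_THERMOPHILUS = [8, 12, 14]
INSERTIONS_E_COLI = [7, 2, 12, 16]

# Identical positions: 277 * 23 / 100 = 63.71, rounded to 64.
identical_positions = round(MODULE_LENGTH * PERCENT_SIMILARITY / 100)

# Positions that differ between the two modules.
differing_positions = MODULE_LENGTH - identical_positions

# Total residues that would have to be inserted in each enzyme.
inserted_t_thermophilus = sum(INSERTIONS_T_THERMOPHILUS)
inserted_e_coli = sum(INSERTIONS_E_COLI)

if __name__ == "__main__":
    print("identical positions:", identical_positions)
    print("differing positions:", differing_positions)
    print("inserted residues (T. thermophilus):", inserted_t_thermophilus)
    print("inserted residues (E. coli):", inserted_e_coli)
```

On these assumptions the calculation reproduces the 213 differences cited above, alongside 34 and 37 inserted residues in the two enzymes, respectively.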

In this case, there is a change in function since the T. thermophilus enzyme acts with glutamate and the E. coli enzyme with glutamine. A similar difficulty is seen when one examines closely the two modules in proteasomes described earlier, which have only 28% similarity. It should be noted that what constitutes a significant modular similarity is not always clear. Traut notes that subtilisin, a bacterial protease, and chymotrypsin, a proteolytic enzyme secreted by the pancreas, have a catalytic pocket that has both the same structure and critical amino acid residues, but otherwise their sequences are entirely different. Traut refers to this as an illustration of convergent evolution, since the two enzymes do not appear to have a common origin.¹⁴ Possible alternative explanations for postulated modular transfers and modular changes will be considered subsequently.

Theological aspects. Although much of the content of this paper is favorable to the concept of modular transfer of gene segments, a word of caution should also be expressed in regard to interpretations from similarity data. As a Christian I have often noted that similarities, whether they are of function, metabolic processes, morphology, or amino acid or nucleotide sequences, need not always be interpreted as indicators of close (i.e., ancestral) relationships. Similarities must surely be an expression of the will of the Creator, who could work through chance events. I believe it is a mistake, however, to limit divine agency by insisting that only naturalistic explanations be considered. God's governance and direction could certainly be involved at a higher level. If a particular amino acid sequence and structure work in one organism, why should they not also be utilized by the Creator in some distantly related organism for a similar function? In comparing groups of modular sequences, the extent of similarity is quite variable, with many modules having similarities of only 20–30%. Does a 25% similarity mean that there was some ancestral sequence in the distant past for a particular module, from which all current sequences for this module have been derived, with the differences being a consequence of random mutations in variable portions of the modules over millions of years? No direct proof of this thesis appears possible. There have been some studies of fossil DNA sequences, but it appears unlikely that we will ever have enough fossil sequences of the types of modules described herein to provide any final answer to the question of possible ancestral relationships.¹⁵ Is there not also an alternative explanation for these modular similarities that considers the possibility of a creator providing a continuing infusion of genetic information into organisms as needed? Or at a higher level, could divine agency act as suggested by Van Till:

… every one of these processes and every connective pathway in the possibility space of variable creatures is itself a mindfully designed provision from a Creator possessing unfathomable intelligence.¹⁶

The answer to these questions may lie somewhere among these three differing views, but I believe a Christian should be careful not to reject by definition possible interpretations that consider the action and direction at some level of the creator. An openness to possible alternative explanations is essential for any research scientist.

Although the experimental evidence reviewed in this paper suggests that modular transfer of gene segments may indeed play some role in providing increasing complexity in higher eukaryotic organisms, the number of instances where this may be the case is still a small fraction of the total 50,000 to 100,000 genes in the human genome. Also, in most proteins that do contain modules there is a considerable portion of the protein molecule that is not modular. One must still account for the genetic information in these portions of protein molecules. Consequently, I believe the basic thesis of my theory of theistic evolution, that in the history of the origin and development of living organisms, at various levels of organization, there has been a continuing provision of new genetic information by an intelligent cause, to still be valid.¹⁷

¹G. C. Mills, "A Theory of Theistic Evolution as an Alternative to the Naturalistic Theory," Perspectives on Science and Christian Faith 47 (1995): 112–22, p. 116.

²G. L. G. Miklos, "Emergence of Organizational Complexities during Metazoan Evolution," Mem. Ass. Australas. Palaeontols. 15 (1993): 7–41.

³P. Bork and A. Bairoch, "Extracellular Protein Modules," an insert distributed within Trends Biochem. Sci. 20, no. 3 (1995): 95–131, produced on behalf of the participants of the International Workshop on Sequence, Structure, Function and Evolution of Extracellular Protein Molecules (Sept. 24–28, 1994 in Margnetetorp, Sweden).

⁴An exon is a coding region of DNA; introns are intervening sequences of noncoding regions that often occur within DNA coding regions of genes.

⁵P. Bork and A. Bairoch, "Extracellular Protein Modules."

⁶C. P. Ponting and C. Phillips, "DHR Domains in Syntrophins, Neuronal Synthases and other Intracellular Proteins," Trends Biochem. Sci. 20 (1995): 102–3.

⁷Eukaryotes include all organisms whose cells contain a nucleus. This distinguishes them from prokaryotes, which include eubacteria and archaebacteria. J. J. Chen and I. M. London, "Regulation of Protein Synthesis by Heme-regulated eIF-2 Kinase," Trends Biochem. Sci. 20 (1995): 105–8.

⁸A. L. Goldberg, "Functions of the Proteasome: The Lysis at the End of the Tunnel," Science 268 (1995): 522–3.

⁹E. Seemüller, A. Lupas, D. Stock, J. Löwe, R. Huber, and W. Baumeister, "Proteasome from Thermoplasma acidophilum: A Threonine Protease," Science 268 (1995): 579–82.

¹⁰G. L. G. Miklos, "Emergence of Organizational Complexities," 13, 30–2.

¹¹G. C. Mills, "A Theory of Theistic Evolution as an Alternative to the Naturalistic Theory," and G. C. Mills, "Theistic Evolution: A Design Theory Utilizing Genetic Information," Christian Scholar's Review XXIV, 444–58.

¹²Ibid.

¹³O. Nurecki, D. G. Vassylev, K. Katayanaga, et al., "Architectures of Class-defining and Specific Domains of Glutamyl t-RNA Synthetase," Science 267 (1995): 1958–65.

¹⁴T. W. Traut, Book review of Proteolysis and Protein Turnover, J. S. Bond and A. J. Barnett, eds. (1993), in American Scientist 83 (1995): 377.

¹⁵For a critique of these, see G. C. Mills, "DNA Sequences in Miocene and Oligo-miocene Fossils: Their Significance to Evolutionary Theory," Perspectives on Science and Christian Faith 46 (1994): 159–68.

¹⁶H. J. Van Till and P. E. Johnson, "God and Evolution: An Exchange," First Things (June/July 1993): 32–46, p. 38.

¹⁷G. C. Mills, "A Theory of Theistic Evolution as an Alternative to the Naturalistic Theory," 114.