Science in Christian perspective


Literary Statistics and Pauline Authorship 
II. Exposition and Critique

1143 Eastwood Drive, 
Mt. Pleasant, Michigan 48858

From: JASA 24 (March 1972): 18-23.

Part I surveyed the development of literary statistics and demonstrated some of its basic concepts. Part II concentrates on an exposition of A.Q. Morton (1960ff) and then offers a critique of the rationale and approach of literary statistics to biblical studies in terms of both Harrison and Morton. Morton contends that no more than five of the 14 epistles traditionally attributed to Paul can safely be regarded as Pauline. He tests their authenticity in terms of sentence length distribution and the frequency of kai (and) and de (particle) as primary measurements. The frequencies of en, autos, and einai are confirmatory tests. These tests are assumed to reflect an unconscious habit pattern of an author which is independent of time, circumstance or subject. The only limitation is that the piece must be prose.

It is the conclusion of this paper that literary statistics can be legitimately used by evangelical scholars within the framework of their view of Scripture. More specifically, it can provide a refined tool to study facets of style and language in the Scripture. Neither Harrison nor Morton has conclusively shown that differences in style among the so-called Pauline Epistles are due to another author. However, their work has clearly pointed out specific differences in writing style and word usage which must be considered by anyone seeking to understand the literary dimension of the Pauline corpus. More basic work needs to be done to refine authenticity tests (parameters) and to specify minimum sample size for this type of work.


In Part I (Journal ASA 23, 96 (1971)), I surveyed the development of literary statistics, and analyzed in some detail the application of "statistical" procedures to the Pastoral Epistles by Harrison. We turn now to an analysis of A.Q. Morton, who has published a great deal of work studying the Pauline Epistles.

The thesis of Morton's publications is that by statistical analysis it can be shown that no more than five of the 14 Epistles traditionally attributed to Paul can safely be regarded as Pauline (the Hauptbriefe and most likely Philemon). Morton feels that at first the Church had no reason to question the authenticity of the Pauline Epistles. But for the past 150 years, scholars have sought to obtain more accurate knowledge of Paul and of Christianity itself. They have done the best they can with the tools available, but there has been little or no agreement concerning which letters are Paul's, and which are spurious. The authors examine literary criticism as a tool for studying Pauline writings, and conclude that it is nonuniform and too inconclusive to be of real value: "Literary criticism, however, widely interpreted is a blunt and awkward instrument for this kind of job; too imprecise and subjective to be decisive".2 They base this conclusion upon the contradictory findings and the widespread disagreement over the proper criteria for evaluation.3

The authors also dismiss theological analysis as a valid way to determine the authenticity of the Pauline Corpus. The reason is essentially the same as that for literary criticism, namely too much disagreement. The authors reach the following conclusion:

The best that can be done through theological acumen is very far from yielding a firm basis for Pauline authorship ... It is due to the fact that literary and theological criticism are incapable of reaching firm conclusions ... It is safe to say that hope of going further by these means is dead ... This of course does not prevent theologians from proceeding as if tentative assumptions were as proven fact.4

Morton's Procedure

The authors indicate several reasons why they cannot accept the traditional authorship attached to the Pauline Corpus. Since prior attempts at this problem have been so subjective and have resulted in rather indecisive conclusions, Morton feels that it is mandatory that a more objective, scientific analysis be made. He tells us that external evidence is no solid basis for this study, and therefore work must begin with internal evidence. When we speak of internal evidence we become involved in the question of style. The word "style" when referring to an author is a rather nebulous term raising all kinds of difficulties. Thus, the authors tell us that for this work, style will be used to mean a very specific thing, namely the choice of words by an author:

For the purposes of this enquiry, composition is understood to be the selection of one word from a number of alternatives and placing that word in a phrase, a clause, or a statement. And style is used here to denote the personal element in that choice.5

Morton then lists the factors which influence an author's choice of words, namely subject matter, cultural background, and simple personal preference. The authors go on to make the assertion (or supposition) that the quality and content of prose depend upon rare or less frequently used words, whereas essential organizational structure depends upon some very common words. The point here is that these common words or filler words become habitual with authors, and thus are a good subject for stylistic studies. Morton feels that the best way to express and assess an author's habitual use of these common words is to apply statistical analysis, because we are working with variable quantities. Thus, Morton begins his development of a statistical test for an author's style by setting forth the principles involved. The first one is the concept of probability; another important principle to grasp is that of the sample-population relationship. The point is that we can never be absolutely certain about authorship; we can only decide in terms of probability for and against certain authorship:

This relationship of the population and sample uses the pattern of argument for a test of authorship. The basis of the test is that, in respect of the habit under examination, all the works of the author form a single population and any of his works can be regarded as samples drawn from this population.6

The first step is to show that the work of a given author is statistically homogeneous, that it has a homogeneous variation among various parts of that work. Then, all the author's works are compared statistically to show that differences between the works are no greater than sampling differences, so that all the works of an author can be treated as a single population. The third step is to show that what is true of the first author is true of all the writers in a class, i.e., to show that you are dealing with general habits, and not simply the personal habits of an individual. The last step in this approach is to show that the tests are sensitive enough to be of practical value, i.e., exclude from the population of an author's work any of those which he didn't write. It is necessary here to employ a battery of tests to be sufficiently discriminating.
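The first two steps amount to a test of homogeneity, which can be sketched with the chi-square statistic. The following Python fragment is a minimal illustration under stated assumptions, not Morton's own computation: the function name `chi_square_homogeneity` and the counts are hypothetical, invented only to show the arithmetic. Each row stands for one sample of text, and the two columns hold the number of sentences with and without some habit (for example, sentences containing a given common word).

```python
def chi_square_homogeneity(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    table: list of rows, one row per text sample; each row lists the
    observed counts in each category (e.g. [with_habit, without_habit]).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if every sample came from one population.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts, invented for illustration only:
# sample A: 10 of 100 sentences show the habit; sample B: 30 of 100.
stat, df = chi_square_homogeneity([[10, 90], [30, 70]])
print(stat, df)
```

For two samples and two categories there is one degree of freedom, and the conventional 5% critical value of chi-square is 3.841. A statistic above that value would be read as a difference larger than sampling variation alone would usually produce; it says nothing by itself about the cause of the difference.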

For students of the Pauline Corpus, the class of writer to be examined is that of writers of homogeneous, continuous Greek prose. Morton feels that Greek and prose are self explanatory, but that two of these words need explanation. "Continuous" is to ensure that samples are not made up from short prose insertions taken from between dialogue or verse, but blocks of prose taken as one piece. "Homogeneous" is used in the statistical sense to ensure that data are of all one kind, i.e., all drawn from the same population. Morton describes the approach as follows:

In summary, we are to look at a representative selection of writers of Greek prose. In each of them we will look at some habits which can be numerically expressed and statistically treated. The aim is to show that, in respect of these habits, all the works of the writer can be shown to be samples drawn from a single and stable population. The examination of half a dozen habits should exclude from the population of genuine works any which are spurious just as half a dozen physical characteristics will enable a jury to decide if the accused was present at the scene of the crime or if some other man was involved.7

Morton carefully examines sentence length as one reliable test of authorship. Morton gives his definition of a sentence for this work:

...sentence is the group of words which end with a full stop ( . ), a colon ( ), or an interrogation mark ( ? ).8
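A definition of this kind is mechanical enough to be sketched in code. The following Python fragment is an illustration only and is not taken from Morton: the function names `sentence_lengths` and `length_distribution`, the default terminator characters, and the bin width of five words are all assumptions made for the example. The terminator set is left as a parameter because the punctuation marks of a printed Greek text differ from the English ones used as the default here.

```python
import re
from collections import Counter


def sentence_lengths(text, terminators=".:?"):
    """Split text at any terminator character and return words per sentence."""
    pattern = "[" + re.escape(terminators) + "]"
    pieces = re.split(pattern, text)
    # Ignore empty pieces, such as the one after the final terminator.
    return [len(piece.split()) for piece in pieces if piece.split()]


def length_distribution(lengths, bin_width=5):
    """Group sentence lengths into bins: 1-5 words, 6-10 words, and so on."""
    counts = Counter((n - 1) // bin_width for n in lengths)
    return {
        (b * bin_width + 1, (b + 1) * bin_width): counts[b]
        for b in sorted(counts)
    }


sample = "one two three. four five: six?"
lengths = sentence_lengths(sample)
print(lengths)
print(length_distribution(lengths))
```

The binned counts from two works are the kind of table that a chi-square comparison would then take as input.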

Morton concludes that sentence length is a useful test for comparing authors' styles and determining authenticity. But it does have some qualifications; no exceptions were found when it was applied within limits of about a 50 year time interval and to homogeneous continuous prose. Furthermore, an argument based upon sentence length must be exclusive; that is, you can never prove that two works were written by one author, only that two works cannot have been written by one author.

There are six Greek words which make up nearly thirty-one percent of the whole New Testament text, and generally speaking these six are the words most frequently used by all Greek prose writers. Among these, five are considered adequate to be tested for sensitivity in determining style differences (see Table 12 in Morton's appendix). Thus, Morton seeks to test these five common words plus sentence length as means to express or compare style differences. Three statistical methods are used: standard error, Poisson distribution and word intervals. The chi-square test is used to compare each of these for deviation from the expected distribution. Morton then tests these words in a comprehensive sampling of classical Greek prose writers. Based on this examination of Greek prose writers, Morton makes the following summary:

1. Greek prose writers have habits which persist over long periods of time and wide ranges of subject matter. Comparisons within the same literary genre can be made with confidence and precision.
2. The habits of writers are affected by change of genre, and the comparison of works of widely differing genre should be made with care and reservations.
3. In all works of this kind, the text should be examined to see that the habit is representative of the work and not affected by some sections of the work which may be quite unlike the work as a whole.

He concludes that sentence lengths and the occurrence of kai and de at the start of sentences are primary tests. The frequencies of en, autos and einai are confirmatory. Morton reports that new tests are under study, so this is not the final word. However, Morton does feel that the basic findings have been made and we need only to refine the tests and data.9
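A comparison of word-frequency habits by standard error can be sketched as follows. This Python fragment is a generic textbook comparison of two proportions under a normal approximation; it is offered as an assumption about the general kind of calculation involved, not as a reconstruction of Morton's tables. The function name `proportion_difference` and all the counts are hypothetical.

```python
import math


def proportion_difference(k1, n1, k2, n2):
    """Compare two observed proportions k1/n1 and k2/n2.

    k = number of sentences showing the habit, n = sentences examined.
    Returns (difference, standard error, difference / standard error).
    """
    p1, p2 = k1 / n1, k2 / n2
    # Pooled proportion: the best single estimate if both samples
    # are drawn from one population.
    pooled = (k1 + k2) / (n1 + n2)
    variance = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    if variance == 0:
        raise ValueError("habit present in all or none of the sentences")
    standard_error = math.sqrt(variance)
    difference = p1 - p2
    return difference, standard_error, difference / standard_error


# Hypothetical counts, invented for illustration only.
diff, se, ratio = proportion_difference(50, 100, 30, 100)
print(round(diff, 3), round(se, 3), round(ratio, 2))
```

A difference of more than about two standard errors is conventionally treated as significant at roughly the 5% level. With short samples the standard error grows, so only large differences can be detected.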

Morton tells us that he is not interested in making a precise classification of the literary form of the Pauline writings, but merely in finding out whether the Pauline Corpus contains differences in literary form, such as those found in Isocrates, sufficient to create significant statistical differences. Perhaps the biggest problem with the Pauline writings is that many of the samples are so short that one could only hope to detect "gross stylistic differences". AQM then reports the findings for sentence length, and the frequency of the common words observed in Paul (kai, de, en, autos, and einai). Morton reaches the following conclusions concerning the analysis of the Pauline Corpus: The Hauptbriefe form a statistically homogeneous group. Between this group and the other Epistles (except Philemon) a large number of significant differences exist. These differences are larger than any differences known to exist in the writings of Greek prose regardless of literary form or any other factor. "It is impossible to explain these differences without assuming a difference of authorship".10 Morton goes on to elucidate this conclusion as follows:

Once it is accepted that the first four major Epistles are by a single author the question arises of deciding who he was. In all this book it is assumed, by definition that Paul is the man who wrote Galatians, and so Paul is the author of all of Galatians, and I Corinthians and of most of Romans and II Corinthians. He may well have written Philemon; there is no evidence which would deny him the authorship of this Epistle. As soon as you turn to the other Epistles, the argument descends to a lower level of certainty. The precision of the argument in the first four Epistles derives from our having the four Epistles to examine and having three of them large enough to divide into samples and test for homogeneity. It appears that the remainder of the Epistles come from several hands. Hebrews is unique, as are Ephesians, Philippians and Colossians. I and II Thessalonians make a pair, as do I and II Timothy. But in each case the decision is made on much less evidence than one would wish to have. It is all the evidence we have and so must logically be accepted, but it should be understood that the two statements, that Paul wrote the four major Epistles and the others come from six hands rests upon two different degrees of certainty corresponding to the evidence which is available.11

Critique of the Statistical Approach

This critique is primarily concerned with the statistical approach in general, rather than with a detailed critique of either Morton or Harrison. I would like to structure this evaluation in terms of the following questions. First, can statistics be applied to prose literature? Then, more specifically, is it legitimate to analyze the Word of God with a statistical approach? Third, what specific function and/or value does statistics have in analyzing Biblical writings?

1. Can statistics be used in literary criticism?

It should be quite clear that whenever numerical data can be accumulated, statistics can be applied to help interpret them. As the historical survey showed earlier, considerable work has been done already in applying statistics to literary analysis. The analysis of Grayston and Herdan, quoted earlier, along with the Biblical data accumulated by Morgenthaler shows conclusively that vocabulary and word usage data can be tabulated and analyzed; Van Elderen has demonstrated that Greek participles also can be tabulated and analyzed; Morton is now in the process of tabulating grammatical differences. Furthermore, Morton's rationale and approach summarized in this paper clearly demonstrate the utility in analyzing variable quantities (words). As Morton further points out, words are not produced by a random generator, but come in context. However, all the work that has been done clearly demonstrates that words do occur in patterns which closely approximate a random distribution; and thus statistical techniques apply to prose in general, and Biblical prose in particular.

2. Is a statistical approach legitimate for the Bible?

The question really reduces to this: What about using statistics to analyze Scripture within the context of our belief that it is the inspired word of God? To answer this question, we must begin with our own view as to how we received the Bible, and the nature of inspiration. "The Bible didn't fall from Heaven, but originated and grew in the Church of God". The books of the Bible were written by human authors in terms of their own personalities, styles and perspective upon the situation. The writing of Scripture was not by mechanical dictation by the Spirit, but the Spirit used each personality with his own talents, training and experiences to convey God's word in a given historical situation; the authors used sources, reflection, and selection of material but were superintended by the Spirit. Thus, the principles of Scripture normative for us are imbedded in the context of another time in history, and in a radically different culture. In order to discern their meaning we first must understand the meaning for that day (exegesis) before we can apply the meaning for our time (exposition). In this, we recognize that we are working with translated copies and so we apply textual criticism to arrive at the best textual source. Simultaneously with this, we need to understand Scriptural language and cultural context as well as study grammar, word meanings and total context in order to arrive at the proper understanding; this is an ongoing process to determine the most accurate interpretation. Therefore, we apply the historico-grammatical approach to interpretation of Scripture. The testimony of Scripture doesn't lie behind the writing, but within the matrix of words, grammar and syntax. Thus we must understand the mind of the writer as much as possible, as well as his vocabulary and style. Linguistic studies, word usage and frequency help us to characterize the form of an author's style and define the literary dimension of Scripture. The following quotes from G.E. Ladd 12 convey the meaning very nicely:

Literary criticism is the study of such questions as the authorship, date, place of writing, recipients, style, sources, integrity, and purpose of any piece of literature. If the Bible had fallen directly from heaven, or had been verbally dictated by the Holy Spirit, literary criticism of the Bible would be irrelevant. If, however, the Holy Spirit used men in given historical situations to be vehicles of the Word of God, then we must try to recover that historical situation by asking critical questions. This is especially true if the Word of God for the entire church was given through the medium of a particular church facing specific problems. We cannot adequately understand the abiding message of God's Word until we have interpreted its particular immediate message in terms of the historical situation. When we study the letters of Paul addressed to individual churches, we must try to interpret what Paul wrote in terms of all we can recover about the situation in the church to which the letter is written.13
Thus the Bible is indeed the inspired Word of God, the Christian's only infallible rule for faith and practice. But the present study has attempted to demonstrate that the truth of infallibility does not extend to the preservation of an infallible text, nor to an infallible lexicography, nor to infallible answers to all questions about authorship, date, sources, etc., nor to an infallible reconstruction of the historical situation in which revelatory events occurred and the books of the Bible were written. Such questions God in His providence has committed to human scholarship to answer; and often the answers must be imperfect and tentative. A proper evangelical, biblical faith suffers a serious disservice when the spheres of Biblical authority and critical judgment are confused.
Although the truth of the Bible is not dependent upon our ability to answer critical questions, it is quite clear that our understanding of the truth of the Bible is enlarged and rendered more precise by such study. A proper biblical criticism therefore does not mean criticizing the Word of God but trying to understand the Word of God and how it has been given to man.14

We turn now to the final series of questions.

3. What specific function or role does statistics have in analyzing Biblical literature?

I would like to answer first in terms of its potential value, and then discuss what the two studies (Harrison and Morton) have accomplished. We have just discussed the historical (culture and time, textual transmission, textual criticism, occasional nature) and human dimension (thought patterns and idiom, personal style) of Scripture. We have also indicated that there is a literary dimension (vocabulary, syntax, thought pattern); all are subject to critical and exegetical studies in order to learn Scriptural meaning for us. The use of probability and confidence limits applies to our hypotheses and theories concerning authorship, style, word usage, textual criticism, etc.

Again Ladd has described this very well:

But evangelical laymen as well as ministers and teachers need to understand that God, in His providence, has given the Word of God to the church through historical events and processes which cannot always be recovered. It is the task of criticism to reconstruct the historical situation so far as it is possible. Since our knowledge at many points is scanty, we often cannot accurately speak of facts, but only of probabilities, possibilities, hypotheses. This is precisely what the rationalistic critic must do. Indeed, the history of criticism is the story of the ebb and flow of critical theories, out of which have emerged many positions so well established that they may be recognized as facts. The evangelical critic must also construct his theories and hypotheses; he must constantly differentiate between facts and theories; but he will establish hypotheses which are consistent with the total biblical data including its doctrine of revelation and inspiration.15

However, as Ladd concludes, all these critical studies must be from the following stance:

Here is perhaps the greatest miracle of the Bible; that in the contingencies and relativities of history God has given to men His saving self-revelation in Jesus of Nazareth, recorded and interpreted in the New Testament; and that in the New Testament itself, which is the words of men written within specific historical situations, and therefore subject to the theories and hypotheses of historical and critical investigation, we have the saving, edifying, sure Word of God. In hearing and obeying the Word of God, the scholar must take the same stance as the layman: a humble response which falls to its knees with the prayer, Speak, Lord for thy servant heareth.16

The crucial point which needs to be made is that statistics is not normative nor does it objectively decide authorship. It only shows (formal) differences in terms of the chosen parameters. The differences detected are only as valid and descriptive as the parameters we choose. Secondly, and perhaps most important, it shows (formal) differences with a certain probability, but it gives no content to those differences. This is an interpretation which must be made by people. Their interpretations will be governed first by their presuppositions concerning Scripture and second, by the meaning of inspiration. That is, even if one accepts Scripture as the Word of God, he may have a slightly different meaning for inspiration and this will influence his interpretation of the differences. Factors such as how he views the use of an amanuensis, occasional nature, organic inspiration, mechanical dictation, etc., will affect his interpretation of those differences. As discussed previously in this section, linguistic studies of word and grammar usage (and frequency) can help us to gain insight into the meaning of organic inspiration. Such study shows the variety of formal style of an author, his diversity in vocabulary and in grammar, or it can suggest that perhaps two works are radically different in vocabulary and word frequency. This is only repeating that statistics is a tool which can be misused or applied to good advantage within the limits of its capabilities.

The following is a summary of the specific ways in which statistics can be helpful in studying style, and biblical linguistics in general:

a. The distribution functions (normal, Poisson, binomial) give us "handles" to describe vocabulary and grammar patterns in a comprehensive way. Chi-square allows us to compare distributions.
b. Greatest value of statistics here perhaps is in terms of data reduction. Literally hundreds of measurements (word counts etc.) can be reduced to two numbers, average and standard deviation (data range); thus facilitating comparisons, data presentation and allowing general trends to become apparent.
c. It gives a systematic and quantitative way to use smaller samples to measure the probability of differences among large populations, i.e., how much of a book to measure, how many words to count, etc.
d. It puts a quantitative confidence limit upon measuring differences (from the human point of view, it lets us put a number upon the risk involved in determining differences). In short, it gives us a tool to help measure and deal with the inherent variation we find in all things, whether it be the number of times a person uses the word "and", the heights and weights of people, or the tensile strength of metal.
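The data reduction in (b) and the confidence limit in (d) can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `summarize` and the word counts are hypothetical, and the interval uses the normal approximation (multiplier 1.96 for roughly 95% confidence), which is only a rough guide for samples as small as the one shown.

```python
import math
import statistics


def summarize(counts, z=1.96):
    """Reduce a list of measurements to mean, standard deviation,
    and an approximate confidence interval for the mean."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.mean(counts)
    sd = statistics.stdev(counts)          # sample standard deviation
    half_width = z * sd / math.sqrt(n)     # normal approximation
    return mean, sd, (mean - half_width, mean + half_width)


# Hypothetical data: occurrences of one common word in eight
# equal-sized blocks of text (invented for illustration only).
mean, sd, interval = summarize([2, 4, 4, 4, 5, 5, 7, 9])
print(mean, round(sd, 3), interval)
```

The half-width of the interval shrinks with the square root of the number of blocks counted, which is one way of seeing why very short texts allow only gross differences to be detected.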

Harrison concluded on the basis of his work that the Pastoral Epistles were written in the second century by a "Paulist", but they contained some authentic Pauline fragments. Morton concludes that he has objectively and decisively shown that Paul is the author of only Galatians, Romans, I and II Corinthians; the Pastorals along with the other letters (except perhaps Philemon) were written by several other authors.

Harrison's work was more in the popular mode of statistics, whereas Morton's approach is more the science of statistics. However, tabular data, counts and averages are the starting point for the science of statistics, so these approaches differ only in quantity rather than in quality. Hence they can be evaluated basically as the same approach. I should point out that the big difference between Harrison's approach and that of Morton is that the more advanced approach gives one a quantitative estimate of results. In other words, the probability or confidence limits, proper distribution parameters and reduction of the data to workable parameters cannot be adequately accomplished merely in terms of totals and averages. Therefore, whenever possible it is preferable and in many cases necessary to calculate distributions and standard deviations and to make chi-square comparisons. In that sense, Harrison's work was incomplete and not quantitative.

One more general comment needs to be made concerning objectivity. Harrison implicitly indicated he was being objective in his work and Morton expressly asserted pure objectivity. A semblance of "objectivity" can be attained in distinguishing differences such as sentence length or number of kais (as compared with intuitive, qualitative judgments of one individual). However, even with the statistical results indicating a high probability of difference in certain characteristics, the significance or meaning of that difference still must be interpreted by the tester, and here is where the presuppositions of Morton and Harrison have dictated the conclusions. We can state categorically that neither has a claim to pure objectivity.

More specifically, the crucial worth of statistics depends upon the correct parameters. The results of statistical analysis are only as good as the observations and especially the parameters used. Do sentence length, kai, Hapax, frequency of word usage, etc., adequately define a certain author's works? What characterizes an author comprehensively enough to function as a test to define his work? Is the concept of literary style sufficient and comprehensive enough (style is only a formal characteristic as it is usually defined)? The Pastorals have been questioned in essentially four areas, as indicated on the first page of this paper. Linguistic style, or word usage, is only one of the four listed, and yet there seem to be other major differences. Both Harrison and Morton make authenticity judgments on linguistic style (word usage) alone, without regard to other criteria. The obvious question arises then: even if significant stylistic differences are shown to be present, how do we know that style as defined is singly decisive for authenticity? This can only be an assumption, or at best in the case of Morton, a first approximation. Morton at least tested his parameters extensively on Greek authors before applying them to Pauline literature. However, he then assumed they would apply to the Epistles (although AQM feels he has proved the universal validity of his parameters). Morton didn't take into consideration the differences between classical and Koine Greek, nor Paul's use of an amanuensis or the occasional nature of the Epistles; he felt these were insignificant aspects. Morton especially must show that his parameters apply equally as decisively to Koine Greek.

Both Harrison's and Morton's data indicate clear differences in the Pastorals compared with the other letters. All of us must deal with these differences. But we clearly must challenge conclusions which state that on the basis of statistical data alone there is no question that some other author wrote the Pastoral Epistles. Due consideration must be given to other possible explanations such as amanuensis, occasional nature, style differences within Paul, interaction between the amanuensis and Paul's dictation method, etc.

Summary and Conclusions

The following is a summary of the aspects of a statistical approach to literary analysis which should be particularly noticed:

1. Statistics is not objective; the results require interpretations, and thus are influenced by presuppositions. There can be no claims to pure objectivity or ultimate authority.
2. Does linguistic style alone decisively characterize authenticity? Statistical analysis is a purely formal test.
3. What parameters adequately define style?
4. There is a major problem concerning the minimum sample size which allows a valid test.

The following conclusions can be drawn from this analysis:

1. Statistics can legitimately be applied to prose analysis; however, more work needs to be done to refine (perhaps "develop" would be more appropriate) parameters.
2. The statistical approach decisively shows that there are differences in the Pastorals compared with the rest of the Pauline letters (in terms of the given parameters). It has highlighted certain stylistic and linguistic features in the Scripture.
3. The question is still open concerning specific content or the cause for these differences; it has not been conclusively shown by statistics that the differences are due to a different author.
4. Neither Harrison nor Morton has answered the above problem areas in literary statistics.
5. Literary statistics is a tool which can be used legitimately by evangelical scholars within the framework of their view of the Bible. Specifically, it gives us a refined tool to study facets of style and language.


1 Brought up to date in Paul, The Man and the Myth (New York: Harper and Row, 1966). AQM books are co-authored, but the approach and concepts are basically Morton's ideas.
2 Ibid., p. 23.
3 Ibid., p. 25ff.
4 Ibid., p. 37.
5 Ibid., p. 43.
6 Ibid., p. 49.
7 Ibid., p. 51.
8 Ibid., p. 54.
9 Ibid., p. 88.
10 Ibid., p. 94.
11 Ibid., p. 94.
12 Ladd, G.E. The New Testament and Criticism. Grand Rapids: Eerdmans, 1967.
13 Ibid., pp. 112-113.
14 Ibid., pp. 216-217.
15 Ibid., p. 16.
16 Ibid., p. 218.