Does Scripture Support Standardized Testing?

HAROLD W. FAW

Trinity Western University
Langley, British Columbia, Canada

A standardized test involves observations of an individual's behaviour made under specified conditions for the purpose of meaningfully comparing it with that of other people. Because of the very extensive use of these instruments, a storm of criticism has arisen and a great deal of misunderstanding surrounds them. While the critics have often been right, standardized tests do yield information which can facilitate good decision making. Provided we have a clear understanding of their limitations and we use them with a view toward service, standardized tests may indeed help to move us toward the goals of justice and equity.

Tucked away in the seventh chapter of the book of Judges, the Bible recounts a fascinating tale of personnel selection. Having received assurance through his "fleece test" that God intended to use him to deliver Israel, Gideon assembled an army of some 32,000 men to face the Midianite hordes. Unfortunately, God said that he had too large an army to do the job since the Israelites would be inclined to take the credit for victory. In the first stage of the selection process, one with considerable face validity, all those who admitted to being fearful were dismissed! As a result, only 10,000 remained. However, God evidently wanted a hand-picked group, since the results of the second stage of the process were even more dramatic. Gideon was instructed to bring the 10,000 would-be warriors to the water's edge. Those who passed this behavioural test by lapping water from cupped hands numbered a mere 300 men, the exact number God needed to effect a remarkable victory.

A few pages later in the same book (Judges 12) we find the less familiar record of a selection test with absolutely dichotomous outcomes: life or death. To determine the truth or falseness of an Ephraimite's denial of his tribal identity, Jephthah's men simply asked their suspect to say the word "Shibboleth" aloud. If he was unable to correctly pronounce the initial "sh" sound, he was judged to be lying; 42,000 unfortunate victims failed this decisive test on that occasion.

These two accounts illustrate the fact that procedures we would now categorize as "psychological testing" are not an invention of the current century. Nor are they unique in ancient times to the nation of Israel. Philip DuBois (1976) documents the extensive use of achievement testing in China over a period of 3,000 years. Though the content and some of the procedures involved were gradually modified, their basic purpose of selecting persons suitable for public office was retained. As early as 1115 B.C., candidates were required to demonstrate proficiency in the five essential areas of mathematics, music, archery, horsemanship, and writing. In a later era, moral qualities of integrity and piety were also taken into account, along with knowledge of law, finance, geography, agriculture, and military matters.

While psychological testing is clearly not a new phenomenon, its systematic and extensive application has been a prominent feature of only the present century. Unlike the significant though limited use of examinations by the ancient Chinese, in our culture there are tests for nearly every conceivable purpose, and practically everyone's life is influenced in one way or another by these devices. It has been estimated that over 2,500 different psychological tests are currently in use in the United States, and that more than 200 million copies of these tests are marketed annually (Weiner & Stewart, 1984). The Scholastic Aptitude Test alone, faced as part of the admissions ritual by applicants to many North American colleges, is taken every year by more than 1.5 million hopeful high school graduates (Chance, 1988). Comprising some 200 multiple choice questions and taking three hours to complete, the SAT is designed to measure abilities in both mathematical and verbal areas. As a group, these students pay over $20 million for the privilege of enduring these three hours of sweat and toil.

It is evident that standardized psychological tests are a major component of twentieth-century North American culture. Despite persistent criticisms of tests and their creators, it is almost certain that they are here to stay. It therefore behooves us to be informed as to their nature and influence, and to thoughtfully evaluate the significant role they play in our lives.

The Nature of Standardized Tests

A good deal of confusion surrounds the development and use of psychological tests. While a definition will not solve the problem completely, it may provide a helpful starting point. A fairly typical one is given by Lee Cronbach (1984) in his widely used text Essentials of Psychological Testing: "A test is a systematic procedure for observing behaviour and describing it with the aid of numerical scales or fixed categories" (p. 26). Probably the key word in this statement is "systematic." We frequently observe and describe other people's behaviour, even using numbers to do it (e.g. "a 110% effort"), but seldom is the whole process an orderly one. Consequently, the results obtained are of fairly limited value.

Fundamentally, a test involves observing and recording a sample of someone's behaviour. The purpose of the observation is to determine the amount of a particular characteristic (extroversion, numerical skill, etc.) possessed by the individual. In order to make this kind of inference with any confidence, the observations need to be made in at least partially controlled or specified circumstances. For example, if records of how much talking different people do to others around them are sometimes taken in shopping malls, sometimes in staff lunch rooms, and sometimes at birthday parties, the results are not comparable and the observations tell us less about the individuals we are testing than about the situations they are made in. In order to achieve interpretable observations, test users typically examine the behaviour of different people in fairly similar settings.

This characteristic of making the observations in prescribed environments is what gives rise to the notion of "standardized testing." Many observations of behaviour are intended to infer something about the individual, but lack this quality of transferability. For example, the exam I give to my Introductory Psychology class is designed to help me determine how much of the discipline each of the students has understood and retained. However, the specifics of this exam reflect the textbook we use, the additional readings I assign, the tone of class discussions, and the special emphases I make in my lectures. Thus, it is not particularly suitable for someone else's Introductory Psychology class, nor even for my own class on another occasion. My exam lacks standardization, or the characteristic of being designed and administered so as to make results obtained at different times and in different places more meaningfully comparable. Without doubt, standardization is a matter of degree; some tests have nation-wide applicability, while others are of more limited local use. Tests created and used by teachers and professors for their students are not generally regarded as standardized tests. Rather, this term is reserved for those instruments that are developed, published, and made available for more widespread use. Though many of the same ingredients are needed to create good tests for classroom use, the focus of the present discussion is on standardized testing.

In addition to being standardized to different degrees, published tests also vary widely in the domain being assessed as well as in the quality of the instrument. The realm of testing is often divided into the cognitive and non-cognitive domains, reflecting evaluation of abilities and achievement in mental functioning on the one hand, as opposed to variations in personality traits, interests, attitudes, and beliefs on the other. The former are somewhat more clearly defined, and right or wrong responses can readily be specified. The ability areas are hence easier to measure meaningfully. The personality and interest domains involve characteristics that are inherently fuzzy and elusive, making their assessment particularly hazardous. Tests of these characteristics are generally of lower quality psychometrically than are measures of ability.

Beyond these two traditional realms of assessment lies the whole area of situational testing in which the context of evaluation is quite similar to what one encounters in daily life. Examples would be the practical portion of a driver's licence exam or a test given to police recruits in which we systematically record their ability to notice details during simulated job activities.

Published tests are generally evaluated on three major criteria. The first, standardization, reflects the extent to which the instrument can be meaningfully employed in a variety of times and places. This depends on the care with which the test has been constructed and the adequacy of the normative data available. Norms provide the basis for score interpretation. The second and third criteria, reliability and validity, require some further comment.

To the extent that the scores produced by a test are accurate or consistent, we have a reliable measuring tool. A tape measure, for example, is a reliable measure of a person's height since multiple measurements of the same individual taken at various times and different places will yield results that are very nearly equivalent. Similarly, an IQ test which yields a score of 106 is relatively reliable if the same person earns 108 next week, but not very reliable if he/she scores 131 on a subsequent occasion. Although reliability reflects how much a person's score would vary across different testing occasions, it is usually determined by retesting a whole group of people and noting the extent to which the ordering within the group (best, second best, third best, etc.) is preserved. Further details of how reliability is estimated and what affects it are discussed in numerous books on testing (see for example Essentials of Psychological Testing by Cronbach). For our present purposes, it is to be emphasized that reliability is a criterion by which some tests look fairly good and others are seriously lacking.
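The retesting procedure just described can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the scores are invented, the function names are hypothetical, and a rank-order (Spearman) correlation is used as one common way of expressing how well the ordering within a group is preserved. It is not the only reliability estimate in use.

```python
from statistics import mean


def ranks(scores):
    """Return 1-based ranks (1 = lowest score); tied scores share their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_order_reliability(first_testing, second_testing):
    """Correlate the group's rank order on the first testing with that on the retest."""
    return pearson(ranks(first_testing), ranks(second_testing))


# Invented scores for five people tested on two occasions (illustrative only).
week_1 = [106, 98, 121, 113, 89]
week_2_stable = [108, 97, 119, 115, 91]      # scores shift slightly, ordering preserved
week_2_unstable = [131, 118, 95, 90, 112]    # ordering largely scrambled

print(rank_order_reliability(week_1, week_2_stable))
print(rank_order_reliability(week_1, week_2_unstable))
```

With these invented figures the first comparison gives a coefficient of 1.0, because every person keeps the same position in the group even though no score is identical on the two occasions, while the second gives -0.5, the mark of a seriously unreliable instrument.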

The most crucial ingredient of a good test is validity. It can be simply defined as the extent to which a test measures what it claims to measure or achieves its stated purpose. For example, if a test is marketed as a tool for selecting used car salespeople and those who pass it sell three times as many cars under similar conditions as those scoring below a designated minimum grade, the test's claim to predictive validity is warranted. On the other hand, if an intelligence test purports to measure innate general reasoning ability, but more specifically reflects familiarity with middle class Western culture and experiences, it obviously lacks validity. The diversity of ways in which validity is assessed is beyond the scope of our present discussion, but always relates to the test's fulfillment of its stated purpose.
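The sales example can be made concrete with a small sketch. Everything in it is hypothetical: the applicant data are invented, the cut score of 60 is arbitrary, and `sales_ratio` is an illustrative name rather than an established psychometric procedure. It simply compares the average sales of those who passed the test with the average of those who did not.

```python
from statistics import mean


def sales_ratio(applicants, cut_score):
    """Ratio of mean cars sold by those at or above the cut score to those below it.

    `applicants` is a list of (test_score, cars_sold) pairs.
    """
    passed = [sold for score, sold in applicants if score >= cut_score]
    failed = [sold for score, sold in applicants if score < cut_score]
    if not passed or not failed:
        raise ValueError("need at least one applicant on each side of the cut score")
    return mean(passed) / mean(failed)


# Invented (test score, cars sold) records for six salespeople.
applicants = [(82, 30), (75, 24), (91, 36), (48, 10), (55, 12), (40, 8)]

ratio = sales_ratio(applicants, cut_score=60)
print(ratio)
```

Here those who passed sold three times as many cars on average, the pattern the example describes. A real validation study would of course need far more than six people, comparable selling conditions, and a check that the difference does not simply reflect some irrelevant characteristic shared by the high scorers.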

It should be emphasized then, that while tests typically attempt to ascertain the amount of some particular mental or psychological characteristic an individual possesses, the extent to which this objective is in fact reached varies widely from test to test. No available test is perfectly reliable, and certainly none is completely valid. We must not lose sight of the fact that every test, no matter how prestigious, is a fallible measuring tool. It may provide us with useful information we would not otherwise have, but it neither magically nor perfectly reflects a person's inner qualities. It simply gives a basis for more meaningful inferences from behaviour than we would otherwise have.

Controversy Surrounding the Testing Enterprise

While there have always been critics who question the value of tests, in the past three decades a veritable tempest of controversy has arisen over their use. One of the earliest attacks entitled The Tyranny of Testing (Hoffman, 1962) was a lucid and engaging critique of objective testing in the cognitive domain. Hoffman was particularly unhappy with objective items such as multiple choice or true-false whose objectivity he regarded as illusory, residing only in the scoring process. He argued forcefully that these items ignore the quality of the reasoning behind an answer, thus seriously penalizing the more capable student. To support his position, he cited numerous examples of items (all drawn from well-known published tests) which could be interpreted in a variety of different ways. In Hoffman's view, objective tests have a useful but strictly limited place, testing relatively simple factual information well, but achieving greater difficulty largely through increased ambiguity and tapping higher cognitive functioning very slightly, if at all.

Adopting a similar stance, Martin Gross (1962) published a telling indictment of personality testing in a fascinating volume entitled The Brain Watchers. The main target of his attack was the lucrative personnel selection industry in which a wide range of personality tests were being peddled as having almost magical powers to identify the best potential employees. He made the point that frequently the selection process becomes a challenging game in which applicants must identify particular characteristics the tester is looking for, and respond accordingly. The claims made by test-users were elaborate, inflated, and largely unfounded, particularly since faking is known to be a very real phenomenon. The author's own conclusion, in the light of evidence that psychometricians were well aware of the tests' limitations, was not particularly complimentary: "The reticence of these scientist-psychologists has been ably mated to the huzzas and profitable hoopla of their brain-watching colleagues and the slothful ignorance of industry - into a formidable cult that operates only through the grace of many who should know considerably better" (Gross, 1962, p. 275).

During the past twenty-five years, a variety of other critics have entered the fray and numerous recurring complaints have surfaced, accompanied by the responses of test supporters. Many of the concerns, such as those regarding unfairness to minority groups, false claims of identifying innate ability, and invasion of privacy, have their primary application in the area of intelligence testing. Lyman (1986) identified eight common complaints, noting elements of both fact and fancy in most of these. For example, it is evident that no test measures innate ability in its pure form, though the degree to which particular experiences will affect performance varies widely from test to test. Further, there is an element of cultural bias in most, perhaps all tests of ability, even if they carry the label "culture-fair." The critics have something important to say, and test-makers would do well to pay attention to them.

However, in their zeal to point out the weaknesses of standardized tests, many critics have allowed strong emotion to take precedence over clear thinking. As a result, their accusations have at times been badly overstated and quite indefensible. Rudman (1982) reviewed a number of commonly made criticisms of standardized testing and attempted to evaluate the data upon which they are based. Though his own commitment to testing undoubtedly biased his interpretations, he made a good point in challenging Hoffman's charge that tests discourage creativity and penalize the better students. He responds: "...tests are treated anthropomorphically; they are given human qualities. They are assigned the ability to group children, determine their future, support children's goals, dampen creative urges, help children become dishonest, and even undermine the very foundations of education" (Rudman, 1982, p. 221). Rudman went on to argue that, in fact, tests do none of these things; rather, the teachers and administrators who make decisions are responsible if these consequences do occur. In other words, the problem is more in the interpretation and use of the test results than in the test itself. This is a significant point to which we shall later return.

Reflecting on the range of objections that have been raised concerning the use of standardized tests, what reasonable conclusions can be drawn? First, it should be pointed out that the critics have successfully and legitimately dampened the over-enthusiastic zeal of test producers who in their passion to create and market tests have often made extreme and ill-founded claims on behalf of their favourite instruments. One general and very positive impact of the critics, then, has been to force test publishers to be considerably more modest and realistic in how they present their product. This is clearly illustrated in the area of intelligence testing. Van Leeuwen (1982) made the case that it was evident to the early developers of IQ tests that these instruments were in fact culturally relative, but due to their interest in the eugenics movement, they chose to emphasize the innate nature of mental ability. Claims about the permanence and pervasiveness of intelligence were then made, but the evidence over the years has not supported these claims. In his review of recent conceptualizations of intelligence, McKean (1988) notes that the early view of mental ability as a unitary genetically determined trait is becoming progressively less popular. Current theorists emphasize both the diversity and the cultural variation in the concept of intelligence.

But many of the critics are not satisfied with scaled-down claims of what the tests can do. Some of them demand the total abolition of testing, at least of standardized, objective testing. Active debate on this matter continues to the present time and will certainly not be resolved in a few pages. However, one relevant point should be raised at this juncture. If tests are viewed (as I believe they should be) as sources of input for decision-making, then it is an undeniable fact that whether or not tests are used, decisions will still be made. Universities with limited space must accept some and reject other applicants. Employers must sift through the pile of applicants and decide which ones will be hired. Educational funds must be allotted to some and withheld from other children with special needs. Many of the attacks on testing reflect a naive assumption that if the tests are no longer used, then the plague of inappropriate decisions will be terminated. We need to ponder the question as to whether decisions made in the absence of the information provided by appropriate tests will be more equitable, or whether greater unfairness will in fact occur.

A Preliminary Christian Perspective

Having explored something of the nature and scope of standardized testing, and having briefly considered the evaluation made of it from various quarters, we now reflect on the implications which a Christian world view may have for this facet of psychology. What difference does a uniquely Christian perspective make when we examine standardized tests?

So far as I am able to discern, relatively little has been written in this regard; thus, we are venturing into uncharted territory. One helpful perspective has been provided by Van Leeuwen (1985) in her evaluation of the cognitive movement in which she addressed the concept of intelligence. She finds the whole notion to be heavily biased by a Western emphasis on formal operational thought and by measurement tools which are inherently culture bound. The fundamental concept of the fear of God (closely related in scripture with the idea of wisdom) is missing in our understanding of intelligence, which is therefore probably not very close to the core of what the image of God comprises (p. 174).

The Bible has little, either positive or negative, to say about people of low intelligence, but speaks frequently of the "fool" as one who is devoid of moral fibre. Thus, intelligence and wisdom do not seem to be closely related.

While these reflections are valuable, they are exclusive to intelligence, only one of dozens of traits which psychometricians seek to measure. If we were to remove from use all measures of intelligence, the numbers of available standardized tests would be only modestly reduced. What of the remaining hundreds of tests of mechanical ability, dominance, attitudes to authority, and so on? I propose to organize some thoughts in this regard in the form of responses to three fundamental questions:

· Is it appropriate to evaluate people?

· Is justice increased or reduced by standardized tests?

· In our use of tests, is our goal to serve or to be served?

Is Evaluation Appropriate?

While it would be possible to interpret Jesus' words in Matthew 7:1-2 as precluding our judging or evaluating one another, the thrust of this passage seems to be directed toward a censorious attitude often inherent in the critical assessments we make of others. Furthermore, the need for evaluation is affirmed in other passages. For example, in I Timothy 3 and Titus 1, criteria are laid down for the selection of elders and deacons. These passages do not specify how these criteria are to be implemented, though an intimate personal acquaintance with the candidate seems to be assumed. The situation, however, is somewhat parallel to a job selection in which tests are currently so often used. I see nothing in these passages to rule out the use of tests, provided they help us implement appropriate criteria more effectively.

If evaluation of other people is not only tolerated but actually in some cases authorized and required, then the merits of using tests to facilitate the process need to be considered. Personnel selection is only one of numerous situations in which these sorts of selections must be made, and while tests must not predetermine our selection criteria, they may, when properly applied, facilitate the application of these criteria. Nor, I believe, does the use of selection tests preclude a recognition of our need for divine wisdom in making such decisions. If God's intentions were revealed in times past through the equivalent of dice, then surely standardized tests can serve that function as well!

Is Justice Increased?

There is little disputing that a fundamental theme of scripture is equity and fairness, and that justice is close to the heart of God. Speaking of God in Deuteronomy 32:4, Moses declares: "All His ways are just." The Psalmist frequently appealed to God when he saw unfairness around him (e.g., Psalm 82:2), confirming his belief in God's justice. Furthermore, our essential human duty is described as the responsibility "to act justly, to love mercy and to walk humbly with your God" (Micah 6:8). Clearly, then, we cooperate with the purposes of God when our activities enhance the process of fairness in human relationships.

The question begging an answer is, "Do standardized tests contribute to the achievement of justice?" Obviously, when tests used for selection purposes are biased against people of certain races, social strata, or cultural background in ways that do not relate to successful performance of the task at hand, they are inappropriate. To discern whether or not this is occurring is of course difficult, but clearly the issue of test validity surfaces as a very critical one. Tests of moderate or low validity should be either dropped or used in such a way as to weight their influence in accordance with their limited validity, making way for consideration of other factors of greater potential value. Furthermore, the criteria in place for the determination of test validity need to be closely examined in the light of the goal of equity.

It is perhaps appropriate here to add a comment in support of standardized tests, a point often overlooked by zealous critics of testing. Given that people are prone to bias and often consciously prejudiced against specific subgroups, we need to consider the possibility that tests may in fact increase justice because they are constructed with deliberate intent to avoid elements of unfairness to minority groups. As Novick contends, "the proper use of well-constructed and validated tests provides a far better basis for making decisions about individuals and programs than would otherwise be available" (Novick, 1984, p. 15). While not all tests are well constructed, and many are improperly used, it seems likely that judicious application of the better ones will lead to decisions that are more equitable than those based on the personal judgement of people who may have difficulty suspending their own values and views. In both the development and the selection of tests for all uses, the minimization of bias and the achievement of justice need to be given priority. When a test does not contribute to the realization of these goals, it should be eliminated.

Is Our Goal to Serve?

One of the recurring concerns articulated by critics of tests is that tests and their creators often wield too much power (Hoffman, 1962). They become masters rather than servants. There is no doubt that testing is a lucrative business, estimated to produce revenues of $60 million annually (Weiner & Stewart, 1984, p. 183). Its economic implications, for the companies involved, if not for society as a whole, are very substantial. Consequently, in their enthusiasm to market a successful product, test developers are inclined to paint an overly rosy picture of what their particular instrument is capable of doing. Likewise, test users whose motivation is one of efficiency and economic advantage are likely to employ tests in a self-serving way with little regard for the welfare of candidates or applicants involved.

It is in this context that I believe a crucial Christian distinction arises. The point which Farnsworth (1985) made with regard to an orientation to service rather than to personal advantage in the applied area of counselling has equal if not greater relevance to the whole of standardized testing. In all of our creating, selecting, and using of these instruments, the goals of organizational efficiency and economic gain must be balanced by a genuine concern for the welfare of the persons being tested. This will have two implications. First, considerable care will be taken in communicating both the results and the limitations of the testing process to clients so that they can gain useful self-understanding from the experience. These discussions will assist the test taker to know how best to invest his/her unique abilities and characteristics, as well as help the organization to more effectively deploy its human resources. Secondly, the goal of economic advantage alone will be an insufficient basis for administering a test. We will establish as a minimum criterion for test adoption the requirement that some benefit accrue to the individual as well as to the organization.

Concluding Reflections

It is perhaps by now evident that this author does not see as a viable resolution to the problems surrounding the use of standardized tests the proposal that we simply abolish them. Such a suggestion is rooted in the naive assumption that the problems run no deeper than the test instruments themselves. A standardized test is a tool, relatively neutral in itself when appropriately constructed, but with potential for both use and abuse.

Perhaps one of the reasons that tests have often been misused is that in this context there is danger rather than safety in numbers. As has been aptly pointed out (Shelley & Cohen, 1986), when we attempt to describe human traits in terms of numbers, two problems tend to arise. First, quantification may usurp the original goal of accurate description and become an end in itself. Secondly, when results are reported in terms of numbers, a deceptive aura of precision surrounds them. This causes us to place more credence in the results than may be warranted, and to forget more quickly that the test score is only an approximation of the individual's true score on that trait. We need to remind ourselves frequently of what tests can and cannot do.
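One way to keep the approximate nature of a score in view is to report a band rather than a single number. The sketch below uses the standard error of measurement from classical test theory, SD x sqrt(1 - reliability). That formula is a standard textbook result rather than anything drawn from the sources cited here, and the standard deviation of 15 and reliability of 0.91 are assumed values chosen only for illustration.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Classical test theory estimate: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)


def score_band(observed, sd, reliability, width=1.0):
    """Return (low, high): the observed score plus or minus `width` standard errors."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - width * sem, observed + width * sem


# Assumed figures for illustration: SD of 15 and a reliability coefficient of 0.91.
low, high = score_band(observed=106, sd=15, reliability=0.91)
print(f"Observed 106, plausible range roughly {low:.1f} to {high:.1f}")
```

Even with a reliability this high, the band spans about nine points, a useful reminder that a reported score of 106 says less about the person than its apparent precision suggests.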

Armed with this word of caution however, we must also avoid the other extreme of blaming tests unduly. Jenifer (1984), in making the distinction between the messenger (the test) and the message (the result obtained by a particular individual), wisely reminds us that we should not blame the messenger when the message is an unwelcome one, provided it is an accurate reflection of reality. This of course brings us right back to validity once again. Tests which have been demonstrated to fulfill their stated purpose can and should be used to the degree warranted by their validity, with appropriate interpretative caution.

The key to beneficial use of standardized tests is to be aware of both their capabilities and their limitations, thus avoiding overinterpretation of the results. In this regard, the conclusion stated by Ravitch (1984), though directed mainly to achievement testing, is well worth quoting:

In sum, there can be no doubt that the tests have their uses as well as their misuses. The standardized test must always be seen as a measuring device, an assessment tool, never as an end in itself. The skills that it measures are important, but it does not measure every important skill. The information that it gives us about the state of a student's learning is never definitive but only tentative and subject to future change. Above all, we should not permit the standardized test to become the be-all and end-all of educational endeavour; we send our children to school not to do well on tests but to become educated people, knowledgeable about the past and the present and prepared to continue learning in the future. (Ravitch, 1984, p. 67)

If we keep this perspective in mind, we can appreciate standardized tests but not be awed by them. We can use them rather than misuse them, and they will serve us rather than oppress us.

References

Chance, P. (1988). Testing education. Psychology Today, 22(5), 20-21.

Cronbach, L.J. (1984). Essentials of Psychological Testing (4th ed.). New York: Harper & Row, Publishers.

DuBois, P. (1976). A Test-dominated society: China 1115 B.C.-1905 A.D. In N.L. Barnette Jr. (ed.), Readings in Psychological Tests and Measurements. Baltimore: The Williams & Wilkins Company.

Farnsworth, K.E. (1985). Furthering the kingdom in psychology. In A.F. Holmes (ed.), The Making of a Christian Mind. Downers Grove, IL: InterVarsity Press.

Gross, M.L. (1962). The Brain Watchers. New York: Random House.

Hoffman, B. (1962). The Tyranny of Testing. New York: Collier Books.

Jenifer, F.G. (1984). How test results affect college admissions of minorities. In C.W. Daves (ed.), The Uses and Misuses of Tests. San Francisco: Jossey-Bass Publishers.

Lyman, H.B. (1986). Test Scores and What They Mean (4th ed.). Englewood Cliffs, NJ: Prentice-Hall.

McKean, K. (1988). Intelligence: New ways to measure the wisdom of man. In M.G. Walraven & H.E. Fitzgerald (eds.), Psychology 88/89: Annual Editions. Guilford, CT: The Dushkin Publishing Group.

Novick, M.R. (1984). Importance of professional standards for fair and appropriate test use. In C.W. Daves (ed.), The Uses and Misuses of Tests. San Francisco: Jossey-Bass Publishers.

Ravitch, D. (1984). Value of standardized tests in indicating how well students are learning. In C.W. Daves (ed.), The Uses and Misuses of Tests. San Francisco: Jossey-Bass Publishers.

Rudman, H.C. (1982). The standardized test flap. In J. Rubinstein and B.D. Slife (eds.), Taking Sides: Clashing Views on Controversial Psychological Issues. Guilford, CT: The Dushkin Publishing Group Inc.

Shelley, D. & Cohen, D. (1986). Testing Psychological Tests. New York: St. Martin's Press.

Van Leeuwen, M.S. (1982). I.Q.ism and the just society. Journal of the American Scientific Affiliation, 34, 193-201.

Van Leeuwen, M.S. (1985). The Person in Psychology (ch. 8). Grand Rapids, MI: W.B. Eerdmans Publishing Company.

Weiner, E.A. & Stewart, B.J. (1984). Assessing Individuals: Psychological and Educational Tests and Measurements. Toronto: Little, Brown & Company.

Divine folly is wiser than human wisdom, and divine weakness stronger than human strength.

-from I Corinthians 1