Science in Christian Perspective



The Role of Statistics in the Scientific Method
California Institute of Technology

From: JASA 7 (September 1955): 30-31.

The theory and methodology of statistics is a complex and technical subject. Thus in order to give a non-technical account it will be necessary to limit the discussion to only the most general aspects of the subject.

Now the scientific method consists in formulating a physical hypothesis and then testing the hypothesis by means of a physical experiment. The problem, then, is to determine when and how statistics enter into these procedures. Let us look more closely at the process of testing the hypothesis. It consists in carrying out the experiment and then interpreting the results. Furthermore, the interpretation involves making some decision concerning the hypothesis under consideration. But any physical experiment involves some kind of random error, even perhaps the human errors associated with reading the position of a pointer on a scale. For even the most accurate measuring device can only be read to a limited number of significant figures. Thus the decision concerning the hypothesis must take into account these chance effects. But chance effects can only be treated by means of the laws of probability, and hence the decision concerning the hypothesis is indeed of a statistical nature. It will thus express a probability judgment. However, it will not be a probability statement concerning the truth or falsity of the hypothesis. For probabilities are associated with events rather than statements. Let us examine, then, how the laws of probability can be used to gain information about the hypothesis from the results of the experiment.

For simplicity, let us suppose that there are only two possible alternatives in the physical situation we are contemplating and that the hypothesis asserts that under the conditions of the experiment a particular one of these alternatives must hold. We shall further assume, for the moment, that the laws which govern the chance effects are known. Then under the supposition that the hypothesis is true, namely, that a particular one of the alternatives holds, the probability that the observed outcome of the experiment would occur can be computed. This gives a number P1 which is not less than zero and not greater than one. Similarly, under the supposition that the hypothesis is not true, namely, that the other alternative holds, the probability that the observed outcome of the experiment would occur can be computed, giving a second number P2. Clearly if P1 is close to one while P2 is close to zero, which means that the observed outcome of the experiment is likely to occur if the hypothesis is true but is unlikely to occur if the hypothesis is false, we would be inclined to conclude that the experiment supports the hypothesis. On the other hand, if P1 is close to zero while P2 is close to one, we would be inclined to conclude that the experiment supports the contrary hypothesis. Finally, if P1 and P2 are nearly equal, then we must conclude that the experiment is inconclusive as far as the hypothesis is concerned. In any case, however, the numbers P1 and P2 give numerical measures of the bearing of the experiment upon the hypothesis.
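The comparison of the two probabilities can be sketched in a few lines of Python. This is only an illustration: the function name, the labels it returns, and above all the cutoff `margin` are assumptions made here, since the article says only "close to one," "close to zero," and "nearly equal" and gives no numerical threshold.

```python
def bearing_on_hypothesis(p1: float, p2: float, margin: float = 0.5) -> str:
    """Classify an observed outcome from its probability under each alternative.

    p1: probability of the observed outcome if the hypothesis is true.
    p2: probability of the observed outcome if the hypothesis is false.
    margin: hypothetical cutoff on the difference between p1 and p2;
            it is an assumption of this sketch, not a value from the article.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    if p1 - p2 >= margin:
        return "supports the hypothesis"
    if p2 - p1 >= margin:
        return "supports the contrary hypothesis"
    return "inconclusive"
```

With the assumed margin, an outcome with probabilities 0.98 and 0.02 would be read as supporting the hypothesis, while one with probabilities 0.40 and 0.45 would be read as inconclusive. A different margin would move the boundary between these cases.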

It should be noted that the traditional, non-statistical method of interpretation, which accepted or rejected the hypothesis depending upon the outcome of the experiment, represents a sort of limiting form of the above procedure. Namely, if the experiment is such that, if the hypothesis is true, the outcome of the experiment will almost surely give P1 very close to one and P2 very close to zero, while if the hypothesis is not true, the outcome will almost surely give P1 very close to zero and P2 very close to one, then the traditional method of interpretation does indeed agree with the one outlined above. In the more exact sciences (physics and chemistry) it was true that until recent years most experiments were of this form and it was not necessary to use the more elaborate statistical techniques. Thus Dr. Robert A. Millikan used to reply, when urged to use statistical methods, that "a really good physicist designs and carries out his experiment in such a careful way that statistical methods are not required." In recent years, when the fundamental problems in these sciences have become less readily accessible to experimental investigation, such a statement no longer holds. For frequently statistical methods must be employed to obtain significant information from even the most elaborate and carefully designed experiment. Furthermore, in biology and related fields, where there are usually a large number of factors affecting the outcome of the experiments, many of which it was impossible to control, it became clear very early that a statistical analysis was necessary before valid conclusions could be drawn.

In order to illustrate the statistical method, let us consider a particular example. A manufacturer of artillery shells has reason to suspect that a defective batch of fuses has been incorporated in a certain lot of shells. He knows that if the fuses are good, the chances are 999 in 1000 that the shells will explode, while if the fuses are defective the chances are only 1 in 2 that the shells will explode. He makes the hypothesis that the fuses are defective and performs the experiment of firing 10 shells. All 10 shells explode. What does he conclude concerning his hypothesis? Following the above procedure, a simple probability calculation shows that if the hypothesis is true, namely if the fuses are defective, then the probability that all 10 shells will explode is (1/2)^10, or approximately 1 in 1000. On the other hand, if the hypothesis is not true, namely if the fuses are good, another simple calculation shows that the probability that all 10 shells will explode is (999/1000)^10, or approximately 99 in 100. He will thus conclude that, contrary to his expectation, the fuses are indeed good. Moreover, he now knows what risk he is taking when he draws this conclusion. For either he is right and the fuses are good, or he is wrong and an event, namely all 10 shells exploding, has occurred whose probability is 1 in 1000. In this sense he is taking a chance of 1 in 1000 of being wrong. Clearly information of this latter type cannot be obtained from the traditional method of interpretation.
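The two "simple calculations" in this example can be checked directly. The short Python sketch below assumes, as the example implicitly does, that the shells explode or fail independently of one another; the function and variable names are chosen here for illustration.

```python
def prob_all_explode(p_single: float, n_shells: int) -> float:
    """Probability that every one of n_shells explodes.

    Assumes each shell explodes independently with probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie between 0 and 1")
    if n_shells < 0:
        raise ValueError("n_shells must be non-negative")
    return p_single ** n_shells


# Hypothesis true: the fuses are defective, each shell explodes 1 time in 2.
p_if_defective = prob_all_explode(0.5, 10)

# Hypothesis false: the fuses are good, each shell explodes 999 times in 1000.
p_if_good = prob_all_explode(0.999, 10)

print(f"all 10 explode, fuses defective: {p_if_defective:.6f}")
print(f"all 10 explode, fuses good:      {p_if_good:.6f}")
```

The first value is exactly 1/1024, which the text rounds to 1 in 1000, and the second is a little above 0.99, in agreement with the figure of 99 in 100.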

This example also illustrates the fact that the statistical analysis may contribute to the formulation of the experiment. A simple analysis along the above lines might have shown that the manufacturer would have gained sufficient information for his purposes by firing only five shells. Thus the expense of firing the additional five shells could have been saved. Furthermore, other possible experiments could be devised and analysed. The manufacturer would then choose the particular experimental procedure which minimizes the cost and risk. This application of statistics, which is called "experimental design," has had a remarkable development during the past decade and is widely used in scientific experimental work.
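A rough sketch of such a design calculation for the shell example follows. It asks how many shells must be fired so that, if all of them explode, the chance of this happening with defective fuses is no larger than some tolerated risk. The tolerated risk is an assumption of the sketch: the article does not say what risk the manufacturer would accept, and the 5 percent figure used below is chosen only because it happens to give five shells.

```python
def shells_needed(p_explode_if_defective: float, acceptable_risk: float) -> int:
    """Smallest number of shells n such that the probability of all n
    exploding, given defective fuses, does not exceed acceptable_risk.

    Assumes independent firings. acceptable_risk is a hypothetical
    tolerance supplied by the experimenter, not a value from the article.
    """
    if not 0.0 < p_explode_if_defective < 1.0:
        raise ValueError("p_explode_if_defective must be strictly between 0 and 1")
    if not 0.0 < acceptable_risk < 1.0:
        raise ValueError("acceptable_risk must be strictly between 0 and 1")
    n = 1
    while p_explode_if_defective ** n > acceptable_risk:
        n += 1
    return n


# With an assumed tolerance of 5 percent, five shells suffice: (1/2)^5 = 1/32.
n_for_five_percent = shells_needed(0.5, 0.05)

# A tolerance of 1 in 1000 requires the full ten shells: (1/2)^10 = 1/1024.
n_for_one_in_thousand = shells_needed(0.5, 0.001)

print(n_for_five_percent, n_for_one_in_thousand)
```

The sketch considers only the risk of wrongly accepting a defective lot when every shell explodes. A fuller design would also weigh the cost of each firing and the chance of a good lot producing a failure, as the paragraph above indicates.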

Before concluding this description of the statistical method, some remarks should be made concerning the means by which knowledge of the laws governing the chance fluctuations present in the experimental procedure is obtained. First of all, it may happen that a great many similar experiments under similar conditions have already been carried out. If this is the case, the results already obtained can be used to determine the underlying distributions of the chance fluctuations. Secondly, there is the possibility of using statistical tests which are independent of the nature of the underlying distributions. Such tests, which depend upon distribution-free statistics, are usually not efficient, since they must give valid results even when the most unfavorable distribution of the chance effects happens to be present. Finally, if there are no systematic errors in the experimental system, there is a fundamental theorem of statistics, called the Central Limit Theorem, which asserts that the average of a large number of independent measurements will be approximately distributed according to the Gaussian or normal law. Thus in a well designed experiment it is always possible to be in the position of knowing the underlying laws governing the chance fluctuations by simply repeating the measurements a sufficient number of times.
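The statement of the Central Limit Theorem can be illustrated by a small simulation. The sketch below assumes, purely for illustration, that each measurement carries an error spread uniformly between -0.5 and +0.5 (such as might arise from reading a pointer to the nearest scale division); the error law, the sample sizes, and the seed are all choices made here, not taken from the article. Although the individual errors are far from normal, the averages should cluster around zero with a spread close to the predicted one, and roughly 68 percent of them should fall within one standard deviation, as the normal law requires.

```python
import math
import random
import statistics


def sample_averages(n_measurements: int, n_repeats: int, seed: int = 0) -> list:
    """Return n_repeats averages, each taken over n_measurements
    independent errors drawn uniformly from [-0.5, 0.5]."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    averages = []
    for _ in range(n_repeats):
        total = sum(rng.uniform(-0.5, 0.5) for _ in range(n_measurements))
        averages.append(total / n_measurements)
    return averages


n_measurements = 50
averages = sample_averages(n_measurements, n_repeats=5000)

center = statistics.fmean(averages)
spread = statistics.pstdev(averages)

# A uniform error on an interval of width 1 has variance 1/12, so the
# average of n independent errors has standard deviation sqrt(1 / (12 n)).
predicted_spread = math.sqrt(1.0 / (12.0 * n_measurements))

# For a normal law, about 68 percent of values lie within one standard deviation.
within_one_sigma = sum(abs(a - center) <= spread for a in averages) / len(averages)

print(f"center of averages:     {center:.4f}")
print(f"observed spread:        {spread:.4f}")
print(f"predicted spread:       {predicted_spread:.4f}")
print(f"fraction within 1 s.d.: {within_one_sigma:.3f}")
```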

In conclusion, we shall list some of the obvious implications of the inherently statistical nature of scientific knowledge as outlined above. First of all, statistical methods are intimately tied into the actual operations involved in the experimental procedure. For the calculation of the basic probabilities is determined by the details of this procedure. Thus it emphasizes again the basic operational character of scientific knowledge. Next, it shows that with any scientific conclusion there is always associated a probability. By performing increasingly elaborate and careful experiments these probabilities can be made to approach certainties; they are still, however, probabilities. In this sense, there is nothing absolute about the conclusions of science. It is quite analogous to the situation in philosophy and theology, where meanings of words are never entirely precise. With great effort the precision of the meanings may be greatly increased, but there is still a residue of ambiguity. Finally, since statistical methods contribute to the formation of the experimental procedure and afford the means for a valid interpretation of the results, they form indeed the basic framework of the scientific method.