Science
in Christian
Perspective
Validity of Existing Controlled Studies Examining the Psychological Effects of Abortion
Numerous studies have been concerned with the potential psychological sequelae (potential psychological risks) of abortion but conclusions reached are inconsistent. This paper is based on a comprehensive review of studies addressing the question of post-abortion psychological sequelae. Controlled studies were categorized according to research design and then systematically examined for experimental validity. Poor use of methodology and research design surfaced as an explanation for differing conclusions across the literature. As a further means of examining the integrity of comparisons in the literature made between women having and not having abortions, the maximum likely statistical power was calculated for each controlled study. As a whole, the literature exhibited grossly substandard power characteristics. An effort to isolate the best study to date was made, and a summary of the conclusions from this study is presented. We conclude that the question of psychological sequelae to abortion is not closed.
In this paper we do not wish to address questions surrounding the morality of abortion. Rather, we want to provide a review of the psychological sequelae literature aimed at determining the scientific merit of existing studies. Certainly it would be reprehensible to overstate or understate a scientifically validated finding for a "higher" moral cause. Likewise, it would be reprehensible to pass on as "scientific" the claims of studies that exhibit little experimental validity.
To determine the level of rigor that exists in the psychological sequelae literature, we have undertaken a review of this literature from a methodological and statistical perspective. To locate articles we have relied upon computer searches of Index Medicus, Psychological Abstracts, Science Citation Index and the National Institute of Mental Health data base in addition to examination of the bibliographies of all articles located. This search yielded over 300 studies; seventy-six were either clinical case studies or experimental research. In turn, these studies were organized into four categories according to research design: case studies (17), controlled studies (14), retrospective-uncontrolled studies (20) and prospective-uncontrolled studies (23). Each of these four types of research design has strengths and weaknesses, some of which will be described below.
Unfortunately, there are many inconsistencies in the conclusions drawn by the authors of the studies we located. For example, Wallerstein, Kurtz and Bar-Din (1972) found adverse reactions in fifty percent of the cases studied, while Osofsky and Osofsky (1972), in a study published the same year, concluded that there were few, if any, adverse psychological reactions. When results are this varied, both pro-life and pro-choice camps are able to find "evidence" to support their position. Under such circumstances, the need to consider the methodological and statistical practices underpinning each study becomes self-evident. A review of the foundations on which the literature rests sometimes can differentiate between studies that can be trusted and those that come to unwarranted conclusions. If severe methodological flaws in the current literature do exist, these inadequacies, not the conclusions reached, should be the focus of attention. Thus, it is the experimental validity rather than the conclusions of existing studies that provides the focus of this paper.
James L. Rogers received his Ph.D. in psychology from Northwestern University, Evanston, Illinois. He presently holds appointments as associate professor at Wheaton College, Wheaton, IL and research associate at the Northwestern University Medical School. James F. Phifer is currently pursuing his doctoral degree in clinical psychology at the University of Louisville, Louisville, KY. Julie A. Nelson received her M.A. in psychological studies from the Wheaton College Graduate School, Wheaton, IL and currently is in private practice.
Conceptualizing Validity
measure of psychological sequelae). Internal validity refers to the extent to which the observed effects of the outcome variable (psychological sequelae) may be attributed to the treatment (abortion) rather than alternative causes (age, marital status, religious background, etc.). Construct validity pertains to the extent that the outcome measures, treatments, samples and settings utilized in the research represent the theoretical constructs of interest. In the present context, high construct validity would imply (among other things) that the measuring device used to assess risk was reliable and accurate. Finally, external validity refers to the validity with which conclusions can be generalized to and across populations of persons, settings and time. Having high external validity would mean that the conclusions about abortion and psychological risk found in a given study could be safely applied to women other than those actually involved in the study.
We will first discuss statistical conclusion validity as it relates to the post-abortion sequelae literature. There are two types of errors one can make when using a statistical hypothesis test to decide whether an experimental group differs from a control group (i.e., an abortion group differs from a non-abortion control group). Type I error refers to concluding from sample data that there is a difference on the outcome variable (i.e., incidence of psychological trauma) when such is not really the case for the two comparison populations. In effect, you have drawn random samples that look different, but both samples have come from the same population (with regard to the outcome parameter of interest). On the other hand, a Type II error occurs when, on the basis of sample data, it is decided that the samples have come from the same population when really each is from a different population.
Ideally we want to carry out hypothesis tests with a low probability of Type I error (e.g., set alpha, the probability of Type I error, at .05 or lower) as well as a low probability of a Type II error (e.g., we want power, the probability of correctly accepting the alternative hypothesis, to be .95 or higher). Indeed, both types of error can simultaneously be held to a low probability of occurrence if there are sufficient resources to collect adequately large comparison samples.
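The relationship between alpha, power and sample size described above can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming a two-sided, two-sample comparison under a normal approximation; the function names and the standardized effect size passed to them are hypothetical choices for illustration and are not taken from any of the studies reviewed here.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(effect_size: float, n_per_group: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is a standardized mean difference; it is a hypothetical
    input supplied by the user, not a value reported in the literature.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region when the
    # alternative hypothesis is true.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(effect_size: float, power: float = 0.95, alpha: float = 0.05) -> int:
    """Per-group sample size that reaches at least the target power.

    Uses the usual closed-form approximation, which ignores the far
    rejection tail and is therefore very slightly conservative.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    small_effect = 0.2  # assumed for illustration only
    for n in (20, 60, 140, 1000):
        print(f"n per group = {n:5d}  approximate power = {approx_power(small_effect, n):.2f}")
    print("n per group for power .95:", n_per_group_for_power(small_effect))
```

Under these assumptions, holding alpha at .05 while demanding power of .95 for a small effect calls for several hundred subjects per group, which is why limited resources so often translate into a high risk of Type II error.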
In reality it is often too expensive, time consuming or otherwise difficult to collect sample sizes that will allow one to sufficiently protect against both types of errors. Also, investigators without adequate statistical background and/or access to statistical consultation may not understand how crucial adequate sample size is, particularly as it relates to the possibility of making a Type II error. In such instances, investigator motivation may be insufficient to overcome barriers that work against securing adequate sample sizes.
When resources or motivation are insufficient to protect against both a Type I and Type II error, the research should not be carried out. But often it is. The very typical course of action is to maintain protection against a Type I error while tolerating a high risk of a Type II error. In other words, common practice would have us, in the face of limited resources, defend the null hypothesis at the expense of possibly missing a true alternative hypothesis.
An example from the pharmaceutical industry will clarify the usual practice and why it occurs. Suppose it is considered desirable to take a new drug to market but it is too expensive to test the drug against a control product using a large sample size. Most would argue that it would be better for the pharmaceutical firm to err in the direction of not introducing a new drug (that really is better) than to introduce a new drug (thought to be better but that really is not). The implication would be that alpha be kept small at the cost of decreasing power (i.e., increasing Type II error probability). After all, if we falsely conclude that the new drug is better and thus commit a Type I error, society must bear the considerable cost of producing and distributing the new drug only to ultimately discover that it is no better or even worse than the old drug. Protecting from Type I error at the expense of increasing the risk of a Type II error may mean that no one gets a new and better drug, but at least we will not replace a time-tested solution with a solution that does not work. As it turns out, Type I errors are usually more costly to society than Type II errors. Avoiding a Type I error will usually guard the status quo and therefore protect traditional practices and thinking.
It can be argued that under certain circumstances traditional wisdom is on the side of the alternative hypothesis, and to guard it, one must (if resources are limited) increase the risk of a Type I error in order to lower the risk of a Type II error. Indeed, it might be argued that this is the case regarding the question of
Table 1. Statistical Power for Fourteen Comparative Studies That Examine the Psychological Sequelae of Abortion

                   Sample Size                 Harmonic   Relative                  Date
Researcher         N        na       nb        Mean       Power      Country        (Data Collected)
David, et al.      98,612   27,234   71,378    39,426     .99+       Denmark        1974-75
Brewer             7,660    3,550    4,110     3,809      .99+       England        1975-76
Jansson, et al.    30,329   1,773    28,556    3,338      .99+       Sweden         1952-56
Meyerowitz         111      93       18        30         .12        U.S.A.         1963-69
Selare, et al.     42       21       21        21         .10        Scotland       1960-68
Hamill, et al.     128      81       47        59         .17        Scotland       1971-72
Greenglass         126      63       63        63         .17        Canada         1972-73
Niswander          68       49       19        27         .12        U.S.A.         1971-72
Athanasiou         114      76       38        51         .16        U.S.A.         1970-72
McCance            300      192      108       138        .27        Scotland       1967-68
Drower             157      88       69        77         .19        South Africa   1974-75
Brody              152      94       58        72         .19        Canada         1968-70
Simon              78       32       46        38         .13        U.S.A.         1955-63
Todd               102      81       22        35         .13        Scotland       1968-70
Power values were determined as outlined by Cohen (1977). In accordance with Cohen's guidelines for unequal sample sizes, the abortion and control group sample sizes (na and nb, respectively) were converted to a single harmonic mean which was used to enter the power tables.
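The conversion of two unequal group sizes to a single harmonic mean can be sketched as follows. This is a minimal illustration in Python; the function name is hypothetical, and the group sizes are copied from three of the smaller studies in the table. It covers only the harmonic mean step and makes no attempt to reproduce the power tables themselves.

```python
def harmonic_mean_n(n_a: int, n_b: int) -> float:
    """Harmonic mean of two group sizes: 2 * n_a * n_b / (n_a + n_b)."""
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")
    return 2 * n_a * n_b / (n_a + n_b)


# (n_a, n_b) pairs copied from the sample size columns of the table.
examples = {
    "Meyerowitz": (93, 18),
    "Hamill, et al.": (81, 47),
    "McCance": (192, 108),
}

if __name__ == "__main__":
    for researcher, (n_a, n_b) in examples.items():
        print(f"{researcher}: harmonic mean n = {harmonic_mean_n(n_a, n_b):.1f}")
```

Because the harmonic mean is pulled toward the smaller of the two groups, a study with 93 abortion cases and 18 controls carries roughly the statistical weight of a study with about 30 subjects in each group.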
We do not claim the foregoing argument, but we do maintain that when a large number of individuals believe strongly that a difference between experimental and control groups exists, as is the case in this country regarding the sequelae to abortion question, a statistical decision procedure with good power characteristics (i.e., a low risk of a Type II error) must be utilized out of respect for these individuals. In a word, those who are against abortion and believe it to increase psychological sequelae deserve quality studies with good statistical power characteristics. This is true, if for no other reason than that the popular press will label published studies with low statistical power that claim abortion has no psychological effect as "scientific," and in so doing give them a prestigious status. However, the popular press will not bother to explain, because they will not understand, that there was a good chance of arriving at that conclusion due to limited statistical power, quite aside from whether the conclusion is really true. As scientists who understand these concepts, we have a moral responsibility to make sure that the public is not misled by the absence of "statistically significant" differences in studies with low statistical power.
We have just completed an examination of the existing studies that compare a post-abortion group with a control. After making certain assumptions, we have calculated the level of statistical power present in each study. Our conclusion (see Table 1) is that 11 of the 14 existing studies exhibit statistical power that is not likely to exceed, but could be less than, .27. We hold that the majority of currently available comparative studies exhibit grossly substandard power characteristics even under assumptions that, if anything, overestimate power levels.
Mortality becomes a threat when subjects who exhibit certain characteristics of potential importance to the conclusions of a study drop out of one treatment group but not the other. Differential dropout can lead to discrepancies between treatment groups on critical background variables, thus making comparisons at the end of the study impossible to interpret. This is a particularly serious problem in the sequelae to abortion literature due to certain findings reported by Adler (1976). She reviewed 17 studies dealing, to varying degrees, with psychological sequelae. She found sample attrition ranging from 13 percent (Barnes, Cohen, Stoeckle and McGuire, 1971) to 86 percent (Evans and Gusdon, 1973). In her own study, Adler followed up non-responders and found them most likely to be young, Catholic, and unmarried. Each of these characteristics has been associated with a greater likelihood of negative sequelae (Adler, 1975; Payne, Kravitz, Notman, and Anderson, 1976; Osofsky and Osofsky, 1972). Adler concluded that experimental mortality may result in the underestimation of the incidence of adverse responses to abortion.
Selection is a threat when, at the outset of the study, subjects assigned to the experimental condition differ from control subjects on baseline characteristics. In this event, differences or similarities between the experimental and control groups found at the end of the study may be due to the treatment (presence or absence of abortion), one or more baseline differences, or the interaction of the treatment with one or more baseline differences. The threat of selection is usually countered by randomly assigning subjects to conditions, but this, as noted earlier, is impossible for abortion sequelae research. If random assignment cannot be used to equate groups at baseline by chance, then one should at least compare baseline characteristics on selected variables to rule out possible important differences.
Selection was indisputably a potential threat to the internal validity of more than 50 percent of the studies we reviewed because baseline measures simply were not collected. Without carefully establishing the baseline comparability of women who receive an abortion to those who do not on at least such rudimentary characteristics as age, number of children, education, socioeconomic status, social support, marital status and physical health, the meaning of differences or similarities in the incidence of sequelae will remain speculative.
We would like to illustrate some of the difficulties in the way psychological sequelae have been assessed with some examples. Niswander and Patterson (1967), Ewing and Rouse (1973), Kretzschmar and Norris (1967) and Bracken, Hachamovitch and Grossman (1974) devised their own self-report questionnaires to assess the psychological reaction to abortion. However, in virtually all instances no attempt was made to validate these instruments or even assess their reliability (i.e., consistency and precision). A variety of relatively simple methods have been devised for determining reliability (test-retest, parallel forms and split half techniques), but none of these were conducted. Clearly, the use of measuring devices with unknown reliability can potentially distort the conclusions one makes about the psychological impact of abortion.
Other studies have implemented structured or unstructured interviews as the assessment measure (Patt, Rappaport and Barglow, 1969; Wallerstein, Kurtz and Bar-Din, 1972; Osofsky and Osofsky, 1972; Ford, Castelnuovo-Tedesco and Long, 1971; Peck and Marcus, 1966). It is common knowledge that psychiatric interviews can be highly unreliable and are subject to the specific orientation, level of expertise, biases and expectations of the interviewer. In virtually all cases reviewed, no attempt was made to assess inter-rater reliability (the degree to which two interviewers come to similar conclusions about the same subject), or to control for interviewer bias and expectancies. For example, Osofsky and Osofsky (1972) attempted to quantify such behaviors as crying and smiling during an unstructured interview. These behaviors could easily be influenced by characteristics of the interviewer, but no attempt was made to control for such factors.
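To make the notion of inter-rater reliability concrete, the sketch below computes Cohen's kappa, one common chance-corrected index of agreement between two raters. It is offered only as an illustration in Python: kappa is our choice of index for the example, not a statistic reported by any of the studies cited above, and the function name and the category labels used with it are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of subjects")
    n = len(rater_a)
    # Proportion of subjects on which the two raters gave the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up ratings of six interviews by two interviewers.
    interviewer_1 = ["adverse", "none", "none", "adverse", "none", "none"]
    interviewer_2 = ["adverse", "none", "adverse", "adverse", "none", "none"]
    print(f"kappa = {cohens_kappa(interviewer_1, interviewer_2):.2f}")
```

A value near 1 indicates that two interviewers classify the same women in the same way, while a value near 0 indicates agreement no better than chance.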
In general, we found little evidence to suggest that construct validity for the dependent measures used to assess sequelae was at an acceptable level. The list of potential threats to construct validity we found is too great to enumerate in this presentation. However, it includes, in addition to the above problems, such practices as obtaining information concerning the level of emotional adjustment from sources other than the patient (Meyerowitz, Satloff and Romano, 1971; Jacobs, Garcia, Rickels and Preucel, 1974; Pare and Raven, 1970; Lask, 1974); conducting follow-up assessment immediately after the abortion in the recovery room (Bracken, Hachamovitch and Grossman, 1974; Osofsky and Osofsky, 1972; Moseley, Follingstad, Harley and Heckel, 1981); interviewing patients at unsystematized follow-up intervals ranging from one to five years (Kretzschmar and Norris, 1967) or several months to seven years (Meyerowitz, Satloff and Romano, 1971); and including patients who not only received an abortion but were also sterilized concomitantly, thus subjecting the subject to two treatments simultaneously and rendering any form of causal interpretation impossible.
External Validity
External validity refers to the ability to generalize findings across populations, settings and time, and is critical if the information is going to be useful apart from its experimental setting. However, the majority of existing studies utilize small, self-selected samples of women who had their abortion at one specific hospital. Such selection bias would likely limit the generalizability of any conclusions reached, even if the conclusions were made under conditions of high internal validity. For example, Niswander and Patterson (1967) asked the attending physician to approve or disapprove the mailing of a questionnaire to each of the patients, thus eliminating those patients of whom it was thought that the recollection of the abortion experience would be too painful. Abrams, DiBiase and Sturgis (1979) sent questionnaires only to those patients whom they felt were likely to respond. In both of these cases, the subject selection procedure could seriously alter the generalizability (external validity) of results.
Generalizability of results would be greatly enhanced if subject selection were stratified across the various settings in which abortions are performed. Indeed, the distribution of such settings can be approximated. In 1982, 82 percent of abortions in America were performed in non-hospital facilities: 56 percent in abortion clinics, 21 percent in other clinics, and 5 percent in physicians' offices (Henshaw, Forrest and Blaine, 1984). Eighteen percent of abortions were performed in hospitals. Unfortunately, no study of which we are aware has attempted to make the research sample utilized in the study representative of these known demographic characteristics. The distribution of settings for the research sample being used is often not even specified.
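A sample stratified in proportion to these settings could be planned along the following lines. This is a minimal sketch in Python, assuming simple proportional allocation with largest-remainder rounding; the percentages are the 1982 figures quoted above from Henshaw, Forrest and Blaine (1984), while the function name and the total sample sizes are hypothetical.

```python
# Share of U.S. abortions by setting in 1982, as quoted in the text.
SETTING_SHARES = {
    "abortion clinic": 0.56,
    "other clinic": 0.21,
    "physician's office": 0.05,
    "hospital": 0.18,
}


def proportional_allocation(total_n: int, shares: dict) -> dict:
    """Split total_n across strata in proportion to shares.

    Uses largest-remainder rounding so the quotas sum exactly to total_n.
    """
    exact = {setting: total_n * share for setting, share in shares.items()}
    quotas = {setting: int(value) for setting, value in exact.items()}
    leftover = total_n - sum(quotas.values())
    # Hand any remaining subjects to the strata with the largest fractions.
    by_remainder = sorted(exact, key=lambda s: exact[s] - quotas[s], reverse=True)
    for setting in by_remainder[:leftover]:
        quotas[setting] += 1
    return quotas


if __name__ == "__main__":
    for setting, quota in proportional_allocation(1000, SETTING_SHARES).items():
        print(f"{setting}: {quota}")
```

Proportional allocation is only one possible design; a study interested in the smaller strata, such as physicians' offices, might deliberately oversample them and reweight afterwards.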
A second obstacle to external validity is highlighted by the widely varying definitions of psychological sequelae that are used across the various studies in the area. In one respect, the search for abortion related sequelae of many different kinds enhances generalizability. However, to the degree that our confidence in findings is lessened because results of studies that use different definitions of sequelae are difficult to pool, generalizability is diminished. This may contribute to the inconsistencies found among results in the literature. Some studies define negative psychological reactions to abortion in terms of psychological symptomatology such as depression, anxiety or guilt. Another may attach importance to the number of symptoms, while others rely on the subjective experience of the woman as she reports it in a self-report questionnaire. The resulting ambiguities make the literature difficult to summarize as there are no subgroups of studies that consistently measure the same dependent variable defined in the same way. It therefore goes without saying that the literature contains few replications of procedures or findings. Given small sample sizes and virtually no replication across investigators, the potential for nongeneralizable (not to mention unreliable) conclusions is substantial.
Lastly, generalizability across time is a crucial issue. Approximately half of the studies we reviewed were conducted from 1967 to 1973 when abortion laws were being liberalized. During this period, therapeutic abortions were granted on medical and/or psychiatric grounds. The remaining studies were conducted in the mid-to-late 1970's under abortion-on-demand. (Note that some of these studies were not published until the early 1980's.) It is highly questionable whether conclusions drawn from studies utilizing women granted abortions on therapeutic grounds only, as was the case until 1973 in the United States, are generalizable to the current social milieu characterized by abortion-on-demand. Furthermore, as no new studies have been conducted in the current decade, the
It now should be clear that considerable ambiguity surrounds the question of post-abortion sequelae because numerous methodological problems exist in the literature. In the midst of the confusion arising from generally poor methodology, it is only natural to ask whether some of the existing studies are more trustworthy than others. Certainly when studies of relatively high and low validities conflict, the conclusions of the higher quality studies should be given the most weight. As Mintz (1983) has stated, "literally no number of anecdotal reports, uncontrolled trials or poorly designed experiments can outweigh one carefully planned and executed controlled experiment if it results in clear and divergent findings" (p. 74). On this same issue, Smith, Glass and Miller (1980) write: "The important question in surveying a body of literature is to determine whether the best designed studies yield evidence different from more poorly designed studies. If the answer is yes, then one is compelled to believe the best ones" (p. 64).
Pursuing this line of thought, we would like to critique what we consider to be the "best" study done in this area to date. Danish researchers David, Rasmussen and Holst (1981) have carried out the only study we located that exhibited the minimum criteria of a control group, pretest measures, adequate sample size, an attempt to equate non-equivalent groups at baseline, and assessment tools with adequate validity and reliability. It is our hope that the ensuing critique of this study, which in our opinion is one of the few acceptable studies (but certainly not without problems), will highlight in a concrete way the issues that the clinician and/or woman considering abortion must keep in mind when examining the research.
Utilizing the computer linkage of the Danish national case registry, the above authors studied the comparative risk of admission to a psychiatric hospital within three months of an abortion or term delivery for all women under age 50 residing in Denmark. Data on admission to psychiatric hospitals were obtained on 71,378 women carrying pregnancies to term, 27,234 women terminating unwanted pregnancies, and on the total population of 1,169,819 women aged 15 to 49. In determining the incidence rates, only first admissions were recorded; women with an admission during the 15 months prior to the delivery or abortion were excluded.
Figure 1A contrasts women who delivered, women who had abortions, and all women in Denmark aged 15 through 49 on incidence of psychiatric hospitalization. Incidence rates were highest for women who were post-abortion (18.4 per 10,000), next highest for women who were postpartum (12.0 per 10,000), and lowest for all women (7.5 per 10,000). In Figure 1B the incidence rates have been further broken down by age category. Only in women aged 35 through 49 is there a reversal in the direction found in the composite data. Here, women who delivered evidenced a higher rate of psychiatric hospitalization than women who aborted (22.2 per 10,000 vs. 13.4 per 10,000). It appears that the pregnancy event (birth or abortion) interacts with age; women who are post-abortion are at greater risk except in the age category 35 through 49, where the relationship reverses.
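The direction and size of these contrasts are easiest to see as rate ratios. The short Python sketch below divides the post-abortion rate by the postpartum rate for the two comparisons just described; the rates are those quoted above from David, Rasmussen and Holst (1981), while the dictionary layout and function name are our own and purely illustrative. No confidence intervals are attempted, since the stratum-level counts are not given here.

```python
# First-admission rates per 10,000 women, as quoted in the text.
RATES_PER_10000 = {
    "all ages": {"post-abortion": 18.4, "postpartum": 12.0},
    "ages 35-49": {"post-abortion": 13.4, "postpartum": 22.2},
}


def rate_ratio(stratum: str) -> float:
    """Post-abortion rate divided by postpartum rate for one stratum."""
    rates = RATES_PER_10000[stratum]
    return rates["post-abortion"] / rates["postpartum"]


if __name__ == "__main__":
    for stratum in RATES_PER_10000:
        print(f"{stratum}: post-abortion / postpartum = {rate_ratio(stratum):.2f}")
```

A ratio above 1 means the post-abortion group was admitted more often than the postpartum group; the ratio falls below 1 only in the 35 through 49 age category, which is the reversal noted above.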
Incidence of psychiatric hospitalization for postpartum and post-abortion women in each of three marital status categories is depicted in Figure 1C. Differences across conditions are relatively small for women who were currently married or never were married, but are extreme when considering women who were separated, divorced or widowed (16.9 per 10,000 postpartum vs. 63.8 per 10,000 post-abortion). Apparently, women who have suffered a separation from their husbands also have a more difficult time dealing with the termination of the pregnancy. Lack of an emotional support system may be more prevalent for women who are estranged or whose husbands have died.
Our review of the post-abortion sequelae literature suggests that the majority of studies published in this area are greatly flawed.
Although these findings may seem reasonable to those not acquainted with the post-abortion sequelae literature because they mirror traditional expectations, it is apparent to anyone who has read this literature that these outcomes stand in stark contrast to conclusions reached by the majority of researchers. The majority of researchers conclude that there is no greater occurrence of post-abortion sequelae than postpartum sequelae. A study representative of this literature was done by the English researcher Brewer (1977) and was published in the prestigious British Medical Journal. Brewer places the post-abortion rate at only 3 per 10,000 while the postpartum rate was placed at 17 per 10,000. (See Figure le for a comparison to David, Rasmussen and Holst.) Indeed, these findings led Brewer to conclude that "... childbirth is more hazardous in psychiatric terms than abortion ..." (p. 477). However, our analysis indicates that the Danish study by David, Rasmussen and Holst rests upon a much firmer methodological foundation than does the English study by Brewer.
We would like to delineate some of the problems found in the English study authored by Brewer as an illustration of our concern over poor methodology. First, Brewer relied upon a questionnaire that was sent to psychiatrists in a given British catchment area. Thus, his data depended upon each psychiatrist's memory and/or ability (willingness?) to retrieve records. We know of no reliability or validity coefficients for this questionnaire and have no reason to believe that any were computed. Additionally, the questionnaire was sent to only 25% of the psychiatric consultants in the area. There is no guarantee that these consultants are representative, and indeed Sim and Neisser in their analysis "Post-Abortive Psychoses: A Report from Two Centers" (1979) claim that "... the psychiatrist with the greatest responsibility and experience in the area of the assessment and treatment of patients with instability associated with pregnancy did not participate." Brewer also reports that some psychiatric consultants had well defined catchment areas while some had catchment areas that overlapped with those of other psychiatrists. In effect, the result of this overlap was that the denominators in the incidence rates were "estimated." All these practices stand in sharp contrast to David, Rasmussen and Holst's use of computer-held data for the entire population of Danish females aged 15 through 49. In addition, the Danish study matches the post-abortion and postpartum conditions on prior incidence of psychiatric admission over the prior 15-month period, age, marital status, and parity. No attempt appears to have been made in the English study to equate comparison groups on these or any other factors.
Conclusion
No research area is free from inevitable methodological flaws, but not all research is dealing with such grave decisions as whether or not a pregnancy should be terminated. Our point is that when research is dealing with such a crucial issue as possible psychological risks for post-abortion women, we need to be as rigorous as possible in designing and conducting credible research.
At minimum, the findings of David, Rasmussen and Holst, with their differing conclusions from studies evidencing less methodological rigor, should underscore the importance of readdressing the issue of post-abortion psychological sequelae with better experimental design. Findings reported in what we consider to be the most reliable study to date are compatible with the assertion that post-abortion psychological sequelae occur more frequently than postpartum sequelae. Obviously, it is of considerable importance that other well planned studies be conducted in an effort to verify the findings reported by David, Rasmussen and Holst. It is crucial that these studies move beyond psychiatric hospitalization as an endpoint measurement to include other forms of emotional sequelae. At minimum, depression should be measured.
Our review of the literature leads us to conclude that the question of psychological sequelae to abortion is not closed, as many researchers have stated, but remains to be determined. Although such a conclusion fails to satisfy the expectations of either those for or against abortion on demand, it seems to reflect the present state of affairs.
References
Abrams, M., DiBiase, V. and Sturgis, S. (1979). Post-abortion attitudes and patterns of birth control. Journal of Family Practice, 9, 593-599.
Adler, N. E. (1975). Emotional responses of women following therapeutic abortion. American Journal of Orthopsychiatry, 45, 446-454.
Adler, N. E. (1976). Sample attrition in studies of psychosocial sequelae of abortion: How great a problem. Journal of Applied Social Psychology, 6, 240-259.
American Psychiatric Association (1968). Diagnostic and Statistical Manual of Mental Disorders. Washington: APA.
American Psychiatric Association (1980). Diagnostic and Statistical Manual of Mental Disorders. Washington: APA.
Athanasiou, R., Oppel, W., Michelson, L., Unger, T. and Yager, M. (1973). Psychiatric sequelae to term birth and induced early and late abortion: A longitudinal study. Family Planning Perspectives, 5, 227-231.
Barnes, A. B., Cohen, E., Stoeckle, J. D. and McGuire, M. T. (1971). Therapeutic abortion: Medical and social sequelae. Annals of Internal Medicine, 75, 881-886.
Bracken, M. B., Hachamovitch, M. and Grossman, G. (1974). The decision to abort and psychological sequelae. Journal of Nervous and Mental Disease, 158, 154-162.
Brody, H., Meikle, S. and Gerritse, R. (1971). Therapeutic abortion: A prospective study. I. American Journal of Obstetrics and Gynecology, 109, 347-353.
Brewer, C. (1977). Incidence of post-abortion psychosis: A prospective study. British Medical Journal, 6059, 476-477.
Campbell, D. T. and Stanley, J. C. (1963). Experimental and Quasi-Experimental Designs for Research. Chicago: Rand McNally College Publishing Company.
Cohen, J. (1977). Statistical Power Analysis for the Behavioral Sciences. New York: Academic Press, Inc.
Cook, T. D. and Campbell, D. T. (1979). Quasi-Experimentation: Design and Analysis Issues for Field Settings. Chicago: Rand McNally College Publishing Company.
David, H. P., Rasmussen, N. K. and Holst, E. (1981). Postpartum and postabortion psychotic reactions. Family Planning Perspectives, 13, 88-92.
Derogatis, L. R. (1977). The SCL-90 Manual I: Scoring, Administration and Procedures for the SCL-90. Baltimore, Md.: Johns Hopkins School of Medicine, Clinical Psychometrics Unit.
Drower, S. J. and Nash, E. S. (1978). Therapeutic abortion on psychiatric grounds. South African Medical Journal, 54, 604-608.
Evans, D. and Gusdon, J. (1973). Post-abortion attitudes. North Carolina Medical Journal, 34, 271-273.
Ewing, J. A. and Rouse, B. A. (1973). Therapeutic abortion and a prior psychiatric history. American Journal of Psychiatry, 130, 37-40.
Ford, C. V., Castelnuovo-Tedesco, T. P. and Long, K. D. (1971). Abortion: Is it a therapeutic procedure in psychiatry. Journal of the American Medical Association, 218, 1173-1178.
Greenglass, E. R. (1975). Therapeutic abortion and its psychological implications: The Canadian experience. Canadian Medical Association Journal, 113, 754-757.
Hamill, E. and Ingram, I. M. (1974). Psychiatric and social factors in the abortion decision. British Medical Journal, 1, 229-232.
Henshaw, S. K., Forrest, J. D. and Blaine, E. (1984). Abortion services in the United States, 1981-1982. Family Planning Perspectives, 16, 119-127.
Hopkins, J., Marcus, M. and Campbell, S. B. (1984). Postpartum depression: A critical review. Psychological Bulletin, 95(3), 498-515.
Jacobs, D., Garcia, C. R., Rickels, K. and Preucel, R. W. (1974). A prospective study on the psychological effects of therapeutic abortion. Comprehensive Psychiatry, 15, 423-434.
Jansson, B. (1965). Mental disorders after abortion. Acta Psychiatrica Scandinavica, 41, 87-110.
Kretzschmar, R. M. and Norris, A. S. (1967). Psychiatric implications of therapeutic abortion. American Journal of Obstetrics and Gynecology, 98, 368-373.
Lask, B. (1975). Short-term psychiatric sequelae to therapeutic termination of pregnancy. British Journal of Psychiatry, 126, 173-177.
McCance, C., Olley, P. C. and Edward, V. (1973). Long term psychiatric follow-up. In G. Horobin (ed.), Experience with Abortion. Cambridge: Cambridge University Press, pp. 245-300.
Meyerowitz, S., Satloff, A. and Romano, J. (1971). Induced abortion for psychiatric indication. American Journal of Psychiatry, 127, 1153-1160.
Mintz, J. (1983). Integrating research evidence: A commentary on meta-analysis. Journal of Consulting and Clinical Psychology, 51, 71-75.
Mosely, D. T., Follingstad, D. R., Harley, H. and Heckel, R. V. (1981). Psychological factors that predict reaction to abortion. Journal of Clinical Psychology, 37, 276-279.
Niswander, K. and Patterson, R. (1967). Psychological reaction to therapeutic abortion: I. Subjective patient response. Obstetrics and Gynecology, 29, 702-706.
Osofsky, D. and Osofsky, J. (1972). The psychological reaction of patients to legalized abortion. American Journal of Orthopsychiatry, 42, 48-60.
Pare, C. M. and Raven, H. (1970). Follow-up of patients referred for termination of pregnancy. Lancet, 1, 635-638.
Patt, S. L., Rappaport, R. G. and Barglow, P. (1969). Follow-up of therapeutic abortion. Archives of General Psychiatry, 20, 408-414.
Payne, E. C., Kravitz, A. R., Notman, M. T. and Anderson, J. V. (1976). Outcome following therapeutic abortion. Archives of General Psychiatry, 33, 725-733.
Peck, A. and Marcus, H. (1966). Psychiatric sequelae of the therapeutic interruption of pregnancy. Journal of Nervous and Mental Disease, 143, 417-425.
Radloff, L. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385-401.
Sclare, A. B. and Geraghty, B. P. (1971). Therapeutic abortion: A follow-up study. Scottish Medical Journal, 16, 438-442.
Sim, M. and Neisser, R. (1979). Post-abortive psychoses: A report from two centers. In D. Mall and W. F. Watts (eds.), The Psychological Aspects of Abortion. Washington, D.C.: University Publications of America, Inc., pp. 1-13.
Simon, N. M., Rothman, D., Goff, J. T. and Senturia, A. G. (1969). Psychological factors related to spontaneous and therapeutic abortion. American Journal of Obstetrics and Gynecology, 104, 799-808.
Smith, M., Glass, G. and Miller, T. (1980). The Benefits of Psychotherapy. Baltimore, Md.: Johns Hopkins Press.
Spitzer, R. L., Endicott, J. and Robins, E. (1978). Research diagnostic criteria. Archives of General Psychiatry, 35, 837-844.
Todd, N. A. (1971). Psychiatric experience of the abortion act (1967). British Journal of Psychiatry, 119, 489-495.
Wallerstein, S., Kurtz, P. and Bar-Din, M. (1972). Psychological sequelae of therapeutic abortion in young unmarried women. Archives of General Psychiatry, 27, 828-832.
Discovering the truth about the emotional impact of abortion should be of great interest to all. Unfortunately, representatives of both sides of the abortion debate often exercise a high degree of selectivity in their review of the psychological sequelae literature, publicizing only findings that support their position on the matter. This is unfortunate because, when thoughtfully approached, it becomes evident that the question of possible sequelae to abortion exists apart from the ethics of the action. This is true for two reasons. First, doing what is "right" or "wrong" may or may not result in changes in the emotional state. For example, evangelical Christians base the correctness of an action on their interpretation of the Scripture. Relative to a directive or principle found in Scripture, an emotional reaction or the absence of the same in women who have had abortions is of little consequence in providing moral guidance. Second, one key issue in determining the morality of abortion is the question of the "rights of the unborn." A woman's psychological reaction to abortion offers little direction concerning whether or not these rights have been violated.
In this paper we do not wish to address questions surrounding the morality of abortion. Rather, we want to provide a review of the psychological sequelae literature aimed at determining the scientific merit of existing studies. Certainly it would be reprehensible to overstate or understate a scientifically validated finding for a "higher" moral cause. Likewise, it would be reprehensible to pass on as "scientific" the claims of studies that exhibit little experimental validity.
To determine the level of rigor that exists in the psychological sequelae literature, we have undertaken a review of this literature from a methodological and statistical perspective. To locate articles we have relied upon computer searches of Index Medicus, Psychological Abstracts, Science Citation Index and the National Institute of Mental Health data base in addition to examination of the bibliographies of all articles located. This search yielded over 300 studies; seventy-six were either clinical case studies or experimental research. In turn, these studies were organized into four categories according to research design: case studies (17), controlled studies (14), retrospective-uncontrolled studies (20) and prospective-uncontrolled studies (23). Each of these four types of research design has strengths and weaknesses, some of which will be described below.
Unfortunately, there are many inconsistencies in the conclusions drawn by the authors of the studies we located. For example, Wallerstein, Kurtz and Bar-Din (1972) found adverse reactions in fifty percent of the cases studied, while Osofsky and Osofsky (1972), in a study published the same year, concluded that there were few, if any, adverse psychological reactions. When results are this varied, both pro-life and pro-choice camps are able to find "evidence" to support their position. Under such circumstances, the need to consider the methodological and statistical practices underpinning each study becomes self-evident. A review of the foundations on which the literature rests sometimes can differentiate between studies that can be trusted and those that come to unwarranted conclusions. If severe methodological flaws in the current literature do exist, these inadequacies, not the conclusions reached, should be the focus of attention. Thus, it is the experimental validity rather than the conclusions of existing studies that provides the focus of this paper.
Conceptualizing Validity
We elected to adopt a conceptualization of experimental validity proposed by Cook and Campbell (1979) to help systematically determine how seriously the conclusions of a given study should be taken. Experimental validity can be categorized into four different types: statistical conclusion validity, internal validity, construct validity and external validity. Statistical conclusion validity is concerned with the extent to which a study permits valid inference about covariation between the independent variable (the presence or absence of abortion) and the dependent variable (some
Table I
Statistical Power for Fourteen Comparative Studies That Examine the Psychological Sequelae of Abortion