Revisiting the Stanford Prison Experiment: Overlooking the obvious? (unfinished article)

Carnahan and McFarland (2007) present new results that help to explain the abuse that occurred in the Stanford Prison Experiment (SPE). They found that those volunteering for "a psychological study of prison life" scored significantly higher on measures of aggressiveness, authoritarianism, Machiavellianism, narcissism, and social dominance than those who responded to a parallel ad that omitted the words "of prison life," and significantly lower on dispositional empathy and altruism. They argue that the abusive behavior occurring in the SPE could have resulted from a self-selection bias, and that this bias could have had effects at both the individual and the group level. That is, individuals may have tended toward abusive behavior, and when a group of such individuals was assembled they may have "mutually weakened each other’s constraints against abuse and reinforced in each other their willingness to engage in it" (p. 614).

They also note that both the prisoners’ and guards’ authoritarianism scores in the BBC study increased as the study progressed (Haslam, 2006).

In their introductory remarks, Carnahan and McFarland (2007) comment: "When the SPE was conducted in 1971, the situation versus personal disposition debate loomed large. To its authors, the SPE results required a situationist rather than a dispositional explanation (Haney et al., 1973). Because prisoners and guards were assigned randomly to their roles, and because personality measures did not predict behavior in either role (with the exception that five prisoners granted early release due to extreme emotional distress were quite low in authoritarianism), certainly the power of the situation must explain the guards’ cruelty and the prisoners’ passivity and depression" (p. 604).

The "exception" mentioned here is reported in the section "Initial personality and attitude measures" of Haney, Banks, and Zimbardo (1973, p. 81). However, this finding is, of course, from the experiment itself. Haney et al. (1973) report that a rank-ordering of prisoners on the F-scale correlated highly with the duration of their stay in the experiment (r = .898, p < .005). This is the most statistically significant finding in the study.

By the second day of the study, these "exceptional" prisoners began leaving. Since there were only ten prisoners in the study, this means that half of them left prematurely. We also know that abuse escalated as the study progressed. Thus, the escalation of abuse paralleled the loss of half of the "prisoners" from the experiment. If the Carnahan and McFarland (2007) results are correct, then there was likely a self-selection bias operating not only prior to the study, but also in the study itself. It seems likely that the selection bias would operate more powerfully in the study than in the reading of an ad. This effect is dramatic, both because half of the "prisoners" left and because it was the most significant finding of the study. If Carnahan and McFarland (2007) are correct in concluding that group processes of reinforcement led to a mutual weakening of constraints against abuse, then the removal of the less authoritarian prisoners would have had a disproportionate effect. This would have been further exacerbated by the tendency for authoritarianism to increase as the study progressed (Haslam, 2006). Thus, we have both an increase in individual authoritarianism and the removal of those who scored lower on this scale, simultaneous with an escalation in abuse.

The above observation of within-study self-selection would have significantly strengthened the Carnahan and McFarland (2007) argument. However, they apparently failed to notice it, since the finding was relegated to the section "Initial personality and attitude measures". Considering that half of the prisoners left the study, describing them as an exception is hardly justified. Perhaps if Haney et al. (1973) had given this finding more prominence, Carnahan and McFarland (2007) would have included it in their analysis. Given that the "situation versus personal disposition debate loomed large" (Carnahan and McFarland, 2007, p. 604) at that time, it is indeed odd that the most dramatic finding in the study is mentioned only in passing and does not appear at all in Haney et al.'s (1973) conclusion.

In their summary of "an interactionist analysis", Carnahan and McFarland (2007) state, "This analysis does not discount the power of a prison simulation, or of a real prison, to induce abusive behavior. The SPE certainly showed that it can do so" (p. 612). Given the observations above and the BBC Prison Study (Reicher & Haslam, 2006), which did not induce abusive behavior, we must question this widely held assumption about what the SPE showed.

The question can be addressed on several levels. Because the SPE was exceptional in many ways, understanding it and the impact it has had requires going beyond the traditional academic arguments. First, we will look at it from the traditional scientific standpoint and ask what the results show. Next, we will place it in its historical context, as Carnahan and McFarland (2007) have done, but with a wider scope, since this is necessary to understand its origin and impact.

