

The Shrinking Filedrawer

Skeptical Inquirer, May 2001, by Douglas M. Stokes

On the Validity of Statistical Meta-analyses in Parapsychology

It may be easier to explain parapsychological experiments on the basis of chance than has been previously thought.

There are 86,493,225 ways to pull 12 rabbits out of a hat containing 30 rabbits. This and similar facts have major implications for the validity of the statistical meta-analyses that form much of the present case for the existence of such parapsychological phenomena as ESP and psychokinesis.

The above factoid is just one example of the combinatorial explosion, or the counterintuitively large number of ways that one may select k objects from a set of n objects. For instance, there are more than 635 billion 13-card bridge hands that can be dealt from a 52-card deck.
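The two counts quoted above can be checked directly. The following is a minimal sketch using Python's standard library; the helper name `ways_to_choose` is introduced here purely for illustration and does not come from the article.

```python
from math import comb


def ways_to_choose(n: int, k: int) -> int:
    """Number of distinct k-element subsets of an n-element set: n! / (k! (n-k)!)."""
    return comb(n, k)


# 12 rabbits drawn from a hat of 30
rabbits = ways_to_choose(30, 12)
# 13-card bridge hands dealt from a 52-card deck
bridge_hands = ways_to_choose(52, 13)

print(f"{rabbits:,}")       # 86,493,225
print(f"{bridge_hands:,}")  # 635,013,559,600
```

Order within a selection is ignored here, which is why the binomial coefficient rather than a permutation count is the relevant quantity.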

My attention was drawn to the implications of the combinatorial explosion for parapsychology as I read a recent article in the Journal of the Society for Psychical Research describing an ESP study conducted by Alan Vaughn and Jack Houck (Vaughn and Houck 2000). Vaughn and Houck's experimental data consisted of the ESP-guessing results sent to them by twelve subjects who had used their newly developed intuition-training software. Vaughn and Houck state that the probability that the level of success achieved by these subjects could have occurred through plain luck (i.e., in the absence of ESP) is equal to .00036. As this probability is very small, the authors conclude that their experiment provides statistically significant evidence of an ESP effect.

The twelve subjects who voluntarily contributed data to Vaughn and Houck's study were self-selected from a group of subjects of unknown size who also participated in the experiment, but whose results were never recorded because they were never sent in. As the subjects knew their ESP scores before sending them in, it might reasonably be expected that only those subjects who were excited by the high scores they had attained would submit their results. Thus, it is possible that the entire group of subjects actually scored at chance, and that there would be no evidence for ESP if all the scores were examined rather than only the scores of the twelve subjects who chose to mail their results to Vaughn and Houck.

The odds against the results of the twelve subjects being due to chance are 2,778 to 1 according to Vaughn and Houck's statistical analysis. They argue that, for these results to be ascribed to data selection, the larger group of subjects would have had to consist of 33,333 subjects. (This number is obtained by multiplying 12 by 2,778, the deficiency of 3 subjects being due to rounding error.) They further state that they have sold fewer than 1,000 copies of their software, thus implying that the larger group of subjects could have consisted of at most 1,000 subjects. As someone who has taught statistics for over twenty years, I found Vaughn and Houck's estimate of 33,333 to be suspiciously high.
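The arithmetic behind the 33,333 figure can be reproduced in a few lines. This is only a sketch of the calculation as described in the paragraph above; the variable names are illustrative.

```python
# Reported probability of the twelve subjects' results arising by chance
p_value = 0.00036
subjects_reporting = 12

# Odds against chance, taken here as the reciprocal of the p-value (about 2,777.78)
odds_against = 1 / p_value

# Pool size implied by treating the pool as disjoint groups of 12
implied_pool = subjects_reporting * odds_against

print(round(odds_against))   # 2778
print(round(implied_pool))   # 33333
# Multiplying 12 by the already-rounded 2,778 overshoots by 3
print(subjects_reporting * round(odds_against) - round(implied_pool))  # 3
```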

It is true that 33,333 subjects can be divided into 2,778 (actually 2,777.75) disjoint (i.e., nonoverlapping) sets of 12 subjects. But this is not the issue. The issue is rather how many potential sets of 12 subjects could have been chosen from a population of 33,333 potential subjects. The answer is a staggering 3.92 x 10^45 sets. This number is computed from the familiar combinatorial formula C(33333,12) = 33,333!/(12! x 33,321!).

Even if one assumes that only 500 potential subjects existed, the number of possible sets of 12 subjects that may be chosen is 4.46 x 10^23, nearly Avogadro's number (the number of molecules in a mole). Thus, if the 12 subjects with the best scores were to submit their data, one might expect the odds against chance to be more than 10^23 to 1, even in the absence of psi.

Even if only 17 potential subjects existed, 6,188 possible sets of 12 subjects could be chosen from them. I therefore initially thought that this number of potential subjects would suffice to explain Vaughn and Houck's results: if only the 12 subjects with the best scores were to submit their results, one would expect results that would occur only once in 6,188 times. It seemed that one had only to assume that five subjects took part in the guessing but did not send in their results in order to wipe out the evidence for ESP.
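The group counts for the three pool sizes discussed above (33,333, 500, and 17 potential subjects) can be tabulated as follows. This is a minimal sketch; `count_groups` and `GROUP_SIZE` are illustrative names, and the code only counts subsets without making any claim about the probability of the scores themselves.

```python
from math import comb

GROUP_SIZE = 12  # number of subjects who sent in their results


def count_groups(pool_size: int, group_size: int = GROUP_SIZE) -> int:
    """Number of distinct groups of `group_size` that can be drawn from `pool_size` people."""
    return comb(pool_size, group_size)


for pool in (17, 500, 33_333):
    groups = count_groups(pool)
    # Exact integers get unwieldy, so print three significant figures
    print(f"{pool:>6} potential subjects: {groups:.3g} possible groups of {GROUP_SIZE}")
```

Even the smallest pool yields thousands of possible groups, and the count grows by many orders of magnitude as the pool widens.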

Analyzing Subjects Versus Scores

However, I soon realized that the statistical tests were directed at the improbability of the psi scores, not of the sample of subjects. My next step was therefore to conduct a standard "filedrawer" analysis of the kind that is often used in evaluating the statistical significance of psi research. The filedrawer analysis determines how many additional subjects (or experiments) with scores averaging exactly at chance would need to be assumed to exist in order to wipe out the statistical significance of a parapsychological study (or series of experiments). In the case of Vaughn and Houck's results, the filedrawer analysis indicated that 39 such additional subjects would be required. But even this number seemed suspiciously high to me. After all, there are more than 158 billion sets of 12 subjects that can be chosen from a set of 51 (the 12 who reported plus the 39 assumed nonreporters). This seemed like overkill.
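The article does not say which filedrawer formula was used. As a hedged sketch, the code below assumes the common Rosenthal-style "fail-safe N" built on a Stouffer combined Z: the reported p = .00036 is treated as a one-tailed combined result for the 12 subjects, the unreported subjects are assumed to average Z = 0, and significance is taken to be lost at the one-tailed .05 level. The function name `failsafe_n` and these choices are assumptions, not details from the source.

```python
from math import ceil, comb
from statistics import NormalDist


def failsafe_n(p_combined: float, k: int, alpha: float = 0.05) -> float:
    """Number of extra chance-level results needed to pull a combined result back to `alpha`.

    Assumes a Stouffer combination: Z = sum(z_i) / sqrt(k). Adding X results with
    z = 0 gives sum(z_i) / sqrt(k + X); setting this equal to the critical Z and
    solving for X yields X = k * (Z / Z_crit)**2 - k.
    """
    std_normal = NormalDist()
    z_combined = std_normal.inv_cdf(1 - p_combined)  # one-tailed Z for the reported p
    z_critical = std_normal.inv_cdf(1 - alpha)       # about 1.645 for alpha = .05
    return k * (z_combined / z_critical) ** 2 - k


extra_subjects = failsafe_n(p_combined=0.00036, k=12)
needed = ceil(extra_subjects)  # round up to whole subjects
print(needed)

# Number of ways to pick the 12 reporters from the enlarged pool
print(f"{comb(12 + needed, 12):,}")
```

Under these assumptions the result rounds up to 39 additional subjects, which agrees with the figure quoted in the text, and the enlarged pool of 51 gives the "more than 158 billion" possible groups of 12.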