Abstract
A pure significance test would check the agreement of a statistical model with the observed data even when no alternative model was available. The paper proposes the use of a modified p-value to make such a test. The model will be rejected if something surprising is observed (relative to what else might have been observed). It is shown that the relation between this measure of surprise (the s-value) and the surprise indices of Weaver and Good is similar to the relationship between a p-value, a corresponding odds-ratio, and a logit or log-odds statistic. The s-value is always larger than the corresponding p-value, and is not uniformly distributed. Difficulties with the whole approach are discussed.
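As a rough illustration of the quantities named above, the following sketch computes Weaver's surprise index for a discrete model (the expected probability of an outcome divided by the probability of the outcome actually observed), its logarithm as one logarithmic index in the spirit of Good, and the familiar odds and logit transforms of a probability. The function names and the example distribution are hypothetical, chosen only for illustration. The s-value itself is not computed, because its definition is not stated in this abstract.

```python
import math

def weaver_surprise_index(probs, observed):
    """Weaver's surprise index: sum of squared probabilities (the expected
    probability of an outcome) divided by the observed outcome's probability."""
    expected_prob = sum(p * p for p in probs)
    return expected_prob / probs[observed]

def log_surprise_index(probs, observed):
    """Logarithm of Weaver's index (assumed here as a simple logarithmic
    surprise index; Good proposed a wider family)."""
    return math.log(weaver_surprise_index(probs, observed))

def odds(p):
    """Odds corresponding to a probability p with 0 <= p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds corresponding to a probability p with 0 < p < 1."""
    return math.log(odds(p))

# Hypothetical four-outcome model; outcome 3 is one of the two least probable.
probs = [0.5, 0.25, 0.125, 0.125]
rare = weaver_surprise_index(probs, 3)    # index above 1: a surprising outcome
common = weaver_surprise_index(probs, 0)  # index below 1: an unsurprising outcome
```

An index well above 1 marks an outcome much less probable than the model leads one to expect on average, while the logarithmic version places "no surprise" at zero, much as the logit places even odds at zero.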
References
Bayarri, M. J., & Berger, J. O. (1999). Quantifying surprise in the data and model verification (with discussion). In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics (Vol. 6, pp. 53–82). London: Oxford University Press.
Bayarri, M. J., & Berger, J. O. (2000). P values for composite null models (with Robins et al. (2000) and discussion). Journal of the American Statistical Association, 95, 1127–1172.
Church, A. (1940). On the concept of a random sequence. Bulletin of the American Mathematical Society, 46, 130–135.
Dawid, A. P., & Vovk, V. (1999). Prequential probability: Principles and properties. Bernoulli, 5, 125–162.
Evans, M. (1997). Bayesian inference procedures derived via the concept of relative surprise. Communications in Statistics, 26, 1125–1143.
Evans, M., Guttman, I., & Swartz, T. (2006). Optimality computations for relative surprise inferences. Canadian Journal of Statistics, 34, 113–129.
Good, I. J. (1954). The appropriate mathematical tools for describing and measuring uncertainty. In C. F. Carter, G. P. Meredith, & G. L. S. Shackle (Eds.), Uncertainty and business decisions. Liverpool: University Press.
Good, I. J. (1956). The surprise index for the multivariate normal distribution. Annals of Mathematical Statistics, 27, 1130–1135.
Good, I. J. (1988). Surprise index. In S. Kotz, N. L. Johnson, & C. B. Read (Eds.), Encyclopedia of statistical sciences (Vol. 7). New York: Wiley.
Howard, J. V. (1975). Computable explanations. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 21, 215–224.
Jahn, R. G., Dunne, B. J., & Nelson, R. D. (1987). Engineering anomalies research. Journal of Scientific Exploration, 1, 21–50.
Jeffreys, H. (1939, 1961). Theory of probability. Oxford: Oxford University Press.
Jefferys, W. H. (1990). Bayesian analysis of random event generator data. Journal of Scientific Exploration, 4, 153–169.
Lindley, D. V. (1957). A statistical paradox (with discussion). Biometrika, 44, 187–192.
Lindley, D. V. (1977). A problem in forensic science. Biometrika, 64, 207–213.
Martin-Löf, P. (1966). The definition of random sequences. Information and Control, 9, 602–619.
Robins, J. M., van der Vaart, A., & Ventura, V. (2000). Asymptotic distribution of P values for composite null models (with Bayarri and Berger (2000) and discussion). Journal of the American Statistical Association, 95, 1127–1172.
Seillier-Moiseiwitsch, F., Sweeting, T. J., & Dawid, A. P. (1992). Prequential tests of model fit. Scandinavian Journal of Statistics, 19, 45–60.
Seillier-Moiseiwitsch, F., & Dawid, A. P. (1993). On testing the validity of sequential probability forecasts. Journal of the American Statistical Association, 88, 355–359.
Shafer, G. (1982). Lindley’s paradox (with discussion). Journal of the American Statistical Association, 77, 325–351.
Shafer, G., & Vovk, V. G. (2001). Probability and finance: It’s only a game. New York: Wiley.
Ville, J. (1939). Étude critique de la notion de collectif. Paris: Gauthier-Villars.
von Mises, R. (1919). Grundlagen der Wahrscheinlichkeitsrechnung. Mathematische Zeitschrift, 5, 52–99.
Weaver, W. (1948). Probability, rarity, interest and surprise. Scientific Monthly, 67, 390–392.
Acknowledgements
I wish to thank the referees and editors for many helpful comments and suggestions, which have led to major improvements in this paper.
Cite this article
Howard, J.V. Significance Testing with No Alternative Hypothesis: A Measure of Surprise. Erkenn 70, 253–270 (2009). https://doi.org/10.1007/s10670-008-9148-4
DOI: https://doi.org/10.1007/s10670-008-9148-4