Significance Testing with No Alternative Hypothesis: A Measure of Surprise

  • Original Article
Erkenntnis

Abstract

A pure significance test would check the agreement of a statistical model with the observed data even when no alternative model was available. The paper proposes the use of a modified p-value to make such a test. The model will be rejected if something surprising is observed (relative to what else might have been observed). It is shown that the relation between this measure of surprise (the s-value) and the surprise indices of Weaver and Good is similar to the relationship between a p-value, a corresponding odds-ratio, and a logit or log-odds statistic. The s-value is always larger than the corresponding p-value, and is not uniformly distributed. Difficulties with the whole approach are discussed.
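As a rough illustration of the quantities the abstract compares, the sketch below computes Weaver's surprise index (the expected probability of an outcome divided by the probability of the outcome actually observed) and one possible pure-significance p-value for a small discrete model. The binomial model, the function names, and the choice to order outcomes by their probability are assumptions made for this example only; the paper's own s-value is not defined in the abstract and is not reproduced here.

```python
from math import comb, log

def binomial_pmf(n, p):
    """Probabilities of 0..n successes under a Binomial(n, p) model."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def weaver_surprise_index(probs, observed):
    """Weaver's index: expected outcome probability / observed outcome probability."""
    expected_prob = sum(q * q for q in probs)
    return expected_prob / probs[observed]

def probability_ordered_p_value(probs, observed):
    """Total probability of outcomes no more probable than the observed one.

    Assumption: with no alternative hypothesis, 'extreme' is taken to mean
    'improbable under the model'. Other orderings are possible.
    """
    threshold = probs[observed] * (1 + 1e-12)  # tolerate floating-point ties
    return sum(q for q in probs if q <= threshold)

# Hypothetical data: 10 fair coin tosses, 0 heads observed.
probs = binomial_pmf(10, 0.5)
observed = 0
index = weaver_surprise_index(probs, observed)
p_value = probability_ordered_p_value(probs, observed)
log_index = log(index)  # a logarithmic form, in the spirit of Good's indices

print(f"Weaver surprise index: {index:.3f}")
print(f"log surprise index:    {log_index:.3f}")
print(f"p-value:               {p_value:.6f}")
```

A large index and a small p-value both flag the observation as surprising under the model, but they sit on different scales, which is the kind of relationship (probability, odds, log-odds) the abstract alludes to.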

References

  • Bayarri, M. J., & Berger, J. O. (1999). Quantifying surprise in the data and model verification (with discussion). In J. M. Bernardo, J. O. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian statistics (Vol. 6, pp. 53–82). London: Oxford University Press.

  • Bayarri, M. J., & Berger, J. O. (2000). P values for composite null models (with Robins et al. (2000) and discussion). Journal of the American Statistical Association, 95, 1127–1172.

  • Church, A. (1940). On the concept of a random sequence. Bulletin of the American Mathematical Society, 46, 130–135.

  • Dawid, A. P., & Vovk, V. (1999). Prequential probability: Principles and properties. Bernoulli, 5, 125–162.

  • Evans, M. (1997). Bayesian inference procedures derived via the concept of relative surprise. Communications in Statistics, 26, 1125–1143.

  • Evans, M., Guttman, I., & Swartz, T. (2006). Optimality computations for relative surprise inferences. Canadian Journal of Statistics, 34, 113–129.

  • Good, I. J. (1954). The appropriate mathematical tools for describing and measuring uncertainty. In C. F. Carter, G. P. Meredith, & G. L. S. Shackle (Eds.), Uncertainty and business decisions. Liverpool: University Press.

  • Good, I. J. (1956). The surprise index for the multivariate normal distribution. Annals of Mathematical Statistics, 27, 1130–1135.

  • Good, I. J. (1988). Surprise index. In S. Kotz, N. L. Johnson, & C. B. Read (Eds.), Encyclopaedia of statistical sciences (Vol. 7). New York: Wiley.

  • Howard, J. V. (1975). Computable explanations. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 21, 215–224.

  • Jahn, R. G., Dunne, B. J., & Nelson, R. D. (1987). Engineering anomalies research. Journal of Scientific Exploration, 1, 21–50.

  • Jeffreys, H. (1939, 1961). Theory of probability. Oxford: Oxford University Press.

  • Jefferys, W. H. (1990). Bayesian analysis of random event generator data. Journal of Scientific Exploration, 4, 153–169.

  • Lindley, D. V. (1957). A statistical paradox (with discussion). Biometrika, 44, 187–192.

  • Lindley, D. V. (1977). A problem in forensic science. Biometrika, 64, 207–213.

  • Martin-Löf, P. (1966). The definition of random sequences. Information and Control, 9, 602–619.

  • Robins, J. M., van der Vaart, A., & Ventura, V. (2000). Asymptotic distribution of P values for composite null models (with Bayarri and Berger (2000) and discussion). Journal of the American Statistical Association, 95, 1127–1172.

  • Seillier-Moiseiwitsch, F., Sweeting, T. J., & Dawid, A. P. (1992). Prequential tests of model fit. Scandinavian Journal of Statistics, 19, 45–60.

  • Seillier-Moiseiwitsch, F., & Dawid, A. P. (1993). On testing the validity of sequential probability forecasts. Journal of the American Statistical Association, 88, 355–359.

  • Shafer, G. (1982). Lindley’s paradox (with discussion). Journal of the American Statistical Association, 77, 325–351.

  • Shafer, G., & Vovk, V. G. (2001). Probability and finance: It’s only a game. New York: Wiley.

  • Ville, J. (1939). Étude critique de la notion de collectif. Paris: Gauthier-Villars.

  • von Mises, R. (1919). Grundlagen der Wahrscheinlichkeitsrechnung. Mathematische Zeitschrift, 5, 52–99.

  • Weaver, W. (1948). Probability, rarity, interest and surprise. Scientific Monthly, 67, 390–392.

Acknowledgements

I wish to thank the referees and editors for many helpful comments and suggestions, which have led to major improvements in this paper.

Author information

Correspondence to J. V. Howard.

About this article

Cite this article

Howard, J.V. Significance Testing with No Alternative Hypothesis: A Measure of Surprise. Erkenn 70, 253–270 (2009). https://doi.org/10.1007/s10670-008-9148-4

  • DOI: https://doi.org/10.1007/s10670-008-9148-4
