
Optimising newspaper sales using neural-Bayesian technology

  • Original Article
  • Neural Computing & Applications

Abstract

We describe a software system, called Just Enough Delivery (JED), for optimising single-copy newspaper sales, based on a combination of neural and Bayesian technology. The prediction model is a large feedforward neural network in which each output corresponds to the sales prediction for a single outlet. The input-to-hidden weights are shared between outlets; the hidden-to-output weights are specific to each outlet, but linked through the introduction of priors. All weights and hyperparameters can be inferred using (empirical) Bayesian inference. The system has been tested on data for several different newspapers and magazines, yielding consistent improvements of 1–3% more sales for the same total number of copies delivered.
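The architecture in the abstract can be illustrated with a minimal sketch: a shared input-to-hidden layer feeds outlet-specific output weights, and those per-outlet weights are tied together by a common Gaussian prior whose mean and variance are re-estimated from the fitted weights (the empirical-Bayes element). This is a hypothetical toy with synthetic data and assumed dimensions, not the authors' implementation; for brevity the shared input-to-hidden weights are held fixed here, whereas in the paper all weights and hyperparameters are inferred.

```python
import numpy as np

rng = np.random.default_rng(0)
n_outlets, n_samples, n_feat, n_hid = 50, 30, 8, 4

# Synthetic sales data: one shared hidden layer, outlet-specific
# output weights drawn around a common mean of 1.0.
W_shared = rng.normal(size=(n_feat, n_hid))
v_true = 1.0 + 0.3 * rng.normal(size=(n_outlets, n_hid))
X = rng.normal(size=(n_outlets, n_samples, n_feat))
H = np.tanh(X @ W_shared)                      # shared input-to-hidden map
Y = np.einsum('osh,oh->os', H, v_true) \
    + 0.1 * rng.normal(size=(n_outlets, n_samples))

def fit_outlets(H, Y, m, tau2, sigma2=0.01):
    """MAP estimate of each outlet's output weights under a shared
    Gaussian prior N(m, tau2 * I): ridge regression toward the prior mean."""
    V = np.empty((n_outlets, n_hid))
    for o in range(n_outlets):
        A = H[o].T @ H[o] / sigma2 + np.eye(n_hid) / tau2
        b = H[o].T @ Y[o] / sigma2 + m / tau2
        V[o] = np.linalg.solve(A, b)
    return V

# EM-style empirical Bayes: alternate fitting outlet weights with
# re-estimating the prior's mean and variance from those weights.
m, tau2 = np.zeros(n_hid), 1.0
for _ in range(5):
    V = fit_outlets(H, Y, m, tau2)
    m = V.mean(axis=0)                         # update prior mean
    tau2 = ((V - m) ** 2).mean() + 1e-6        # update prior variance

pred = np.einsum('osh,oh->os', H, V)
rmse = np.sqrt(np.mean((pred - Y) ** 2))
print(f"training RMSE: {rmse:.3f}")
```

The shared prior is what lets outlets with little data borrow statistical strength from the others: their weights are shrunk toward the population mean, with the amount of shrinkage set by the learned prior variance rather than by hand.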

[Figs. 1–7: not reproduced in this preview]



Author information


Corresponding author

Correspondence to Tom Heskes.


About this article

Cite this article

Heskes, T., Spanjers, JJ., Bakker, B. et al. Optimising newspaper sales using neural-Bayesian technology. Neural Comput & Applic 12, 212–219 (2003). https://doi.org/10.1007/s00521-003-0384-x

