Dirichlet-Survival Process: Scalable Inference of Topic-Dependent Diffusion Networks

  • Conference paper
Advances in Information Retrieval (ECIR 2023)

Abstract

Information spread on networks can be efficiently modeled by considering three features: the documents’ content, their time of publication relative to other publications, and the position of the spreader in the network. Most previous works model at most two of these jointly, or rely on heavily parametric approaches. Building on the recent Dirichlet-Point process literature, we introduce the Houston (Hidden Online User-Topic Network) model, which jointly considers all three features in a non-parametric, unsupervised framework. It infers dynamic, topic-dependent underlying diffusion networks in a continuous-time setting, along with the topics themselves. Its input is an unlabeled stream of triplets of the form (time of publication, information’s content, spreading entity). Online inference is conducted using a sequential Monte-Carlo algorithm that scales linearly with the size of the dataset. Our approach yields substantial improvements over existing baselines on both cluster recovery and subnetwork inference tasks.
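As an illustration of the input format only, the following minimal sketch represents the unlabeled triplets described above and iterates over them in publication order. The type name `Observation`, its field names, the helper `stream_in_time_order`, and the sample data are all hypothetical; they are not taken from the paper or its code repository, and no inference is performed here.

```python
from typing import Iterable, Iterator, NamedTuple


class Observation(NamedTuple):
    """One unlabeled input triplet (field names are hypothetical)."""
    time: float    # time of publication
    content: str   # the information's content, e.g. raw text
    entity: str    # the spreading entity, e.g. a user or a website


def stream_in_time_order(observations: Iterable[Observation]) -> Iterator[Observation]:
    """Yield observations sorted by publication time, one at a time."""
    yield from sorted(observations, key=lambda obs: obs.time)


# Made-up sample data, deliberately given out of chronological order.
raw = [
    Observation(2.5, "new vaccine results published", "site_b"),
    Observation(0.0, "vaccine trial announced", "site_a"),
    Observation(4.1, "election debate tonight", "site_c"),
]

# A single pass over the stream: each observation is visited exactly once.
for obs in stream_in_time_order(raw):
    print(f"t={obs.time:.1f}  entity={obs.entity}  content={obs.content!r}")
```

An online algorithm would consume such a stream one observation at a time, without the topic or cascade labels that a supervised method would require.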


Notes

  1. https://github.com/GaelPouxMedard/HOUsToN


Author information

Corresponding author

Correspondence to Gaël Poux-Médard.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Poux-Médard, G., Velcin, J., Loudcher, S. (2023). Dirichlet-Survival Process: Scalable Inference of Topic-Dependent Diffusion Networks. In: Kamps, J., et al. Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science, vol 13981. Springer, Cham. https://doi.org/10.1007/978-3-031-28238-6_47

  • DOI: https://doi.org/10.1007/978-3-031-28238-6_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-28237-9

  • Online ISBN: 978-3-031-28238-6

  • eBook Packages: Computer Science, Computer Science (R0)
