Looking at Vector Space and Language Models for IR Using Density Matrices

  • Conference paper
Quantum Interaction (QI 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8369)

Abstract

In this work, we conduct a joint analysis of the Vector Space Model (VSM) and Language Models (LM) for IR using the mathematical framework of Quantum Theory. We shed light on how both models allocate the space of density matrices. A density matrix is shown to be a general representational tool capable of leveraging the capabilities of both VSM and LM representations, thus paving the way for a new generation of retrieval models. We analyze the possible implications suggested by our findings.

Notes

  1. The Dirac notation establishes that \(|u\rangle \) denotes a unit norm vector in \(\mathbb {H}^n\) and \(\langle u| \) its conjugate transpose.

  2. In a more general formulation of the theory, a quantum probability measure reduces to a classical probability measure for any set \(\mathcal {M} = \{M_i\}\) of positive operators \(M_i\) such that \(\sum _i M_i = I_n\). The set \(\mathcal {M}\) is called a Positive-Operator Valued Measure (POVM) [12]. Therefore, the properties reported in this paper, which apply to a complete set of mutually orthogonal projectors, equally hold for a general POVM.

  3. In general, the dyads in the mixture don’t need to be orthogonal. However, when they are not orthogonal, the coefficients \(\upsilon _i\) cannot be easily interpreted as the probabilities assigned by the density matrix to each dyad.

  4. In quantum physics, the meaning of i.i.d. can be associated with the physical notion of measurement. If a density matrix \(\rho \) represents the state of a system, an i.i.d. set of \(m\) quantum events is obtained by performing a measurement on \(m\) different copies of \(\rho \) and recording the outcomes.

  5. In this paper, we do not explicitly take into account situations in which the vectors could contain negative entries. For example, this could easily happen after the application of Rocchio’s algorithm [16] in feedback situations or after reducing the dimensionality of the vector space by LSI [3]. Besides the historically encountered difficulties in interpreting such negative entries [6], in these particular cases the rank equivalences discussed here might not hold. However, we argue that ignoring these situations causes no harm to the generality of our conclusions on the need for an enlarged representation space.

  6. This is indeed the practice of Query Expansion (QE); see, for example, [2].

  7. In [8], each basis of a vector space is considered as describing a contextual property and the vectors in the basis as contextual factors. We prefer not to adopt such an interpretation for two reasons: (1) in this paper, classical sample spaces are exclusively associated with orthonormal bases and (2) we believe that referring to concepts leads to a more general formulation, better tailored to our needs.
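
Notes 2 to 4 can be checked numerically on a toy case. The following is a minimal sketch, not the authors' implementation: it assumes a real two-dimensional space, and the helper names `dyad` and `prob`, the mixture weights 0.7 and 0.3, and the particular POVM are illustrative choices that do not come from the paper. It computes the probability \(\mathrm{tr}(\rho M)\) for a mixture of orthogonal dyads, for a POVM that is not a set of orthogonal projectors, and for a mixture of non-orthogonal dyads, where the coefficient of a dyad no longer equals the probability assigned to it.

```python
import numpy as np


def dyad(v):
    """Return the rank-one projector |u><u| built from v after normalisation."""
    u = np.asarray(v, dtype=float)
    u = u / np.linalg.norm(u)
    return np.outer(u, u)


def prob(rho, m):
    """Probability tr(rho m) that the density matrix rho assigns to operator m."""
    return float(np.trace(rho @ m))


e1, e2 = dyad([1, 0]), dyad([0, 1])

# Note 3, orthogonal case: the mixture weights are the probabilities of the dyads.
rho = 0.7 * e1 + 0.3 * e2
p_orth = [prob(rho, e1), prob(rho, e2)]  # 0.7 and 0.3, up to rounding

# Note 2: a POVM that is not a set of orthogonal projectors (its elements are
# positive and sum to the identity) still yields non-negative numbers summing to one.
povm = [0.5 * e1, 0.5 * e2, 0.5 * np.eye(2)]
p_povm = [prob(rho, m) for m in povm]

# Note 3, non-orthogonal case: replace e2 by the dyad of (1, 1) / sqrt(2).
plus = dyad([1, 1])
rho_mix = 0.7 * e1 + 0.3 * plus
p_e1 = prob(rho_mix, e1)  # 0.7 + 0.3 * 0.5 = 0.85, not the coefficient 0.7

# Note 4: i.i.d. quantum events simulated as outcomes recorded on m = 1000 copies of rho.
rng = np.random.default_rng(0)
outcomes = rng.choice(len(p_povm), size=1000, p=p_povm)
```

In the non-orthogonal mixture, the dyad of \((1, 1)/\sqrt{2}\) overlaps with the first basis dyad, so part of its weight is counted again when the first dyad is measured; this is why the coefficients lose their direct probabilistic reading.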

References

  1. Birkhoff, G., Von Neumann, J.: The logic of quantum mechanics. Ann. Math. 37(4), 823 (1936)

  2. Carpineto, C., Romano, G.: A survey of automatic query expansion in information retrieval. ACM Comput. Surv. 44(1), 1:1–1:50 (2012)

  3. Deerwester, S., Dumais, S.T., Furnas, G.W., Landauer, T.K., Harshman, R.: Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41, 391–407 (1990)

  4. Gao, J., Nie, J.Y., Wu, G., Cao, G.: Dependence language model for information retrieval. In: Proceedings of SIGIR, pp. 170–177 (2004)

  5. Gleason, A.: Measures on the closed subspaces of a Hilbert space. J. Math. Mech. 6, 885–893 (1957)

  6. Hofmann, T.: Unsupervised learning by probabilistic latent semantic analysis. Mach. Learn. 42(1), 177–196 (2001)

  7. Lvovsky, A.I.: Iterative maximum-likelihood reconstruction in quantum homodyne tomography. J. Opt. B: Quant. Semiclassical Opt. 6, S556–S559 (2004)

  8. Melucci, M.: A basis for information retrieval in context. ACM Trans. Inf. Syst. 26, 14:1–14:41 (2008)

  9. Melucci, M.: An investigation of quantum interference in information retrieval. In: Cunningham, H., Hanbury, A., Rüger, S. (eds.) IRFC 2010. LNCS, vol. 6107, pp. 136–151. Springer, Heidelberg (2010)

  10. Melucci, M.: Deriving a quantum information retrieval basis. The Computer Journal 56(11), 1279–1291 (2013). doi:10.1093/comjnl/bxs095

  11. Melucci, M., Rijsbergen, K.: Quantum mechanics and information retrieval. Adv. Top. Inf. Retrieval 33, 125–155 (2011). (Springer, Berlin)

  12. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2004)

  13. Piwowarski, B., Frommholz, I., Lalmas, M., van Rijsbergen, K.: What can quantum theory bring to information retrieval. In: Proceedings of CIKM, pp. 59–68 (2010)

  14. Piwowarski, B., Amini, M.R., Lalmas, M.: On using a quantum physics formalism for multidocument summarization. JASIST 63(5), 865–888 (2012)

  15. van Rijsbergen, K.: The Geometry of Information Retrieval. Cambridge University Press, Cambridge (2004)

  16. Rocchio, J.: Relevance feedback in information retrieval. In: Salton, G. (ed.) The SMART Retrieval System, pp. 313–323. Prentice-Hall, Englewood Cliffs (1971)

  17. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manage. 24(5), 513–523 (1988)

  18. Song, D., Lalmas, M., van Rijsbergen, K., Frommholz, I., Piwowarski, B., Wang, J., Zhang, P., Zuccon, G., Bruza, P., Arafat, S., Azzopardi, L., Huertas-Rosero, A., Hou, Y., Melucci, M., Rüger, S.: How quantum theory is developing the field of information retrieval. In: Proceedings of QI, pp. 105–108 (2010)

  19. Sordoni, A., Nie, J.-Y., Bengio, Y.: Modeling term dependencies with quantum language models for IR. In: Proceedings of SIGIR, pp. 653–662 (2013)

  20. Sordoni, A., He, J., Nie, J.-Y.: Modeling latent topic interactions with quantum interference for IR. To appear in Proceedings of CIKM (2013)

  21. Tsuda, K., Rätsch, G., Warmuth, M.K.: Matrix exponentiated gradient updates for on-line learning and Bregman projection. J. Mach. Learn. Res. 6(1), 995 (2006)

  22. Warmuth, M.K., Kuzmin, D.: Bayesian generalized probability calculus for density matrices. Mach. Learn. 78(1–2), 63–101 (2009)

  23. Widdows, D., Peters, S.: Quantum logic of word meanings: Concept lattices in vector space models (2003)

  24. Wong, S.K.M., Yao, Y.Y.: On modeling information retrieval with probabilistic inference. ACM Trans. Inf. Syst. 13(1), 38–68 (1995)

  25. Zhai, C.X.: Statistical language models for information retrieval: a critical review. Found. Trends Inf. Retr. 2(3), 137–213 (2007)

  26. Zhao, X., Zhang, P., Song, D., Hou, Y.: A novel re-ranking approach inspired by quantum measurement. In: Clough, P., Foley, C., Gurrin, C., Jones, G.J.F., Kraaij, W., Lee, H., Murdock, V. (eds.) ECIR 2011. LNCS, vol. 6611, pp. 721–724. Springer, Heidelberg (2011)

  27. Zobel, J., Moffat, A.: Exploring the similarity space. SIGIR Forum 32(1), 18–34 (1998)

  28. Zuccon, G., Azzopardi, L.: Using the quantum probability ranking principle to rank interdependent documents. In: Gurrin, C., He, Y., Kazai, G., Kruschwitz, U., Little, S., Roelleke, T., Rüger, S., van Rijsbergen, K. (eds.) ECIR 2010. LNCS, vol. 5993, pp. 357–369. Springer, Heidelberg (2010)

  29. Zuccon, G., Piwowarski, B., Azzopardi, L.: On the use of complex numbers in quantum models for information retrieval. In: Amati, G., Crestani, F. (eds.) ICTIR 2011. LNCS, vol. 6931, pp. 346–350. Springer, Heidelberg (2011)

Author information

Corresponding authors

Correspondence to Alessandro Sordoni or Jian-Yun Nie.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sordoni, A., Nie, JY. (2014). Looking at Vector Space and Language Models for IR Using Density Matrices. In: Atmanspacher, H., Haven, E., Kitto, K., Raine, D. (eds) Quantum Interaction. QI 2013. Lecture Notes in Computer Science, vol 8369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54943-4_13

  • DOI: https://doi.org/10.1007/978-3-642-54943-4_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54942-7

  • Online ISBN: 978-3-642-54943-4

  • eBook Packages: Computer Science, Computer Science (R0)
