Non-commutative Logic for Compositional Distributional Semantics

  • Conference paper
Logic, Language, Information, and Computation (WoLLIC 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10388)


Abstract

Distributional models of natural language use vectors to provide a contextual foundation for meaning representation. These models rely on large quantities of real data, such as corpora of documents, and have found applications in natural language tasks such as word similarity, disambiguation, indexing, and search. Compositional distributional models extend the distributional ones from words to phrases and sentences. Logical operators are usually treated as noise by these models, and no systematic treatment of them has been provided so far. In this paper, we show how skew lattices and their encoding in upper triangular matrices provide a logical foundation for compositional distributional models. In this setting, one can model commutative as well as non-commutative logical operations of conjunction and disjunction. We provide theoretical foundations, a case study, and experimental results for an entailment task on real data.
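
To make the non-commutative operations concrete, the following is a minimal numpy sketch in the style of skew lattices in rings [17], where the matrix product of idempotents acts as conjunction and e + f - ef as disjunction. The particular matrices and operations below are an illustrative assumption, not necessarily the paper's exact construction.

    import numpy as np

    def meet(e, f):
        # Non-commutative conjunction: the matrix product of idempotents.
        return e @ f

    def join(e, f):
        # Non-commutative disjunction: e + f - ef (the "circle" operation).
        return e + f - e @ f

    # Two idempotent upper triangular matrices (E @ E == E, F @ F == F).
    E = np.array([[1.0, 2.0], [0.0, 0.0]])
    F = np.array([[1.0, 5.0], [0.0, 0.0]])

    print(meet(E, F))  # equals F
    print(meet(F, E))  # equals E: conjunction does not commute
    print(join(E, F))  # equals E
    print(join(F, E))  # equals F: disjunction does not commute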

K. Cvetko-Vah acknowledges the financial support from the Slovenian Research Agency (research core funding No. P1-0222). M. Sadrzadeh, D. Kartsaklis and B. Blundell acknowledge financial support from AFOSR International Scientific Collaboration Grant FA9550-14-1-0079.


References

  1. Abramsky, S., Coecke, B.: A categorical semantics of quantum protocols. In: Proceedings of the 19th Annual IEEE Symposium on Logic in Computer Science (LiCS 2004). IEEE Computer Science Press (2004). arXiv:quant-ph/0402130

  2. Berendsen, J., Jansen, D.N., Schmaltz, J., Vaandrager, F.W.: The axiomatization of override and update. J. Appl. Log. 8, 141–150 (2010)

  3. Chomsky, N.: Three models for the description of language. IRE Trans. Inf. Theory 2, 113–124 (1956)

  4. Coecke, B., Sadrzadeh, M., Clark, S.: Mathematical foundations for a compositional distributional model of meaning. Linguist. Anal. (Lambek Festschrift) 36, 345–384 (2010)

  5. Cvetko-Vah, K.: Skew lattices of matrices in rings. Algebra Univers. 53, 471–479 (2005)

  6. Cvetko-Vah, K., Salibra, A.: The connection of skew Boolean algebras and discriminator varieties to Church algebras. Algebra Univers. 73, 369–390 (2015)

  7. Cvetko-Vah, K., Leech, J., Spinks, M.: Skew lattices and binary operations on functions. J. Appl. Log. 11, 253–265 (2013)

  8. Firth, J.: A synopsis of linguistic theory 1930–1955. In: Studies in Linguistic Analysis (1957)

  9. Galatos, N., Jipsen, P., Kowalski, T., Ono, H.: Residuated Lattices: An Algebraic Glimpse at Substructural Logics. Studies in Logic and the Foundations of Mathematics, vol. 151. Elsevier, Amsterdam (2007)

  10. Harris, Z.: Distributional structure. Word 10, 146–162 (1954)

  11. Jordan, P.: Über nichtkommutative Verbände. Arch. Math. 2, 56–59 (1949)

  12. Kartsaklis, D., Sadrzadeh, M.: A compositional distributional inclusion hypothesis. In: Amblard, M., de Groote, P., Pogodalla, S., Retoré, C. (eds.) LACL 2016. LNCS, vol. 10054, pp. 116–133. Springer, Heidelberg (2016). doi:10.1007/978-3-662-53826-5_8

  13. Kartsaklis, D., Sadrzadeh, M.: Distributional inclusion hypothesis for tensor-based composition. In: COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, Osaka, Japan, 11–16 December 2016, pp. 2849–2860. ACL (2016)

  14. Kotlerman, L., Dagan, I., Szpektor, I., Zhitomirsky-Geffet, M.: Directional distributional similarity for lexical inference. Nat. Lang. Eng. 16(4), 359–389 (2010)

  15. Lambek, J.: Type grammar revisited. In: Lecomte, A., Lamarche, F., Perrier, G. (eds.) LACL 1997. LNCS, vol. 1582, pp. 1–27. Springer, Heidelberg (1999). doi:10.1007/3-540-48975-4_1

  16. Lambek, J.: The mathematics of sentence structure. Am. Math. Mon. 65, 154–170 (1958)

  17. Leech, J.: Skew lattices in rings. Algebra Univers. 26, 48–72 (1989)

  18. Leech, J.: Skew Boolean algebras. Algebra Univers. 27, 497–506 (1990)

  19. Leech, J.: Normal skew lattices. Semigroup Forum 44, 1–8 (1992)

  20. Leech, J.: Recent developments in the theory of skew lattices. Semigroup Forum 52, 7–24 (1996)

  21. Lin, D.: Automatic retrieval and clustering of similar words. In: Proceedings of the 17th International Conference on Computational Linguistics, vol. 2, pp. 768–774. Association for Computational Linguistics (1998)

  22. Lin, D.: An information-theoretic definition of similarity. In: Proceedings of the International Conference on Machine Learning, pp. 296–304 (1998)

  23. Rubenstein, H., Goodenough, J.: Contextual correlates of synonymy. Commun. ACM 8(10), 627–633 (1965)

  24. Schütze, H.: Automatic word sense discrimination. Comput. Linguist. 24(1), 97–123 (1998)

  25. Weeds, J., Weir, D., McCarthy, D.: Characterising measures of lexical distributional similarity. In: Proceedings of the 20th International Conference on Computational Linguistics, COLING 2004. Association for Computational Linguistics (2004)


Author information

Correspondence to Karin Cvetko-Vah.

A Appendix

A.1 Normalisation Schemes

The raw co-occurrence counts are normalised using two measures; a short code sketch of both follows the list:

  • Probability Ratio

    $$\begin{aligned} \frac{P(w,f)}{P(w)P(f)} \end{aligned}$$

    where P(w, f) is the probability that word w and feature f occur together, and P(w) and P(f) are the probabilities of occurrence of w and f, respectively. This measure tells us how often w and f were observed together compared to how often they would co-occur if they were independent.

  • Positive Pointwise Mutual Information (PPMI)

    $$\begin{aligned} \max \left( \log \frac{P(w,f)}{P(w)P(f)},\ 0\right) \end{aligned}$$

    This is the positive part of the logarithm of the probability ratio: negative logarithmic values are sent to 0.
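
A minimal numpy sketch of both normalisation schemes, assuming a dense matrix of raw co-occurrence counts in which every word and every feature occurs at least once (the probability ratio is the intermediate value ratio below); this is an illustration, not the authors' implementation.

    import numpy as np

    def ppmi(counts):
        """PPMI-normalise a matrix of raw co-occurrence counts.

        counts[w, f] holds the number of times word w occurred with feature f;
        assumes every row and column has at least one non-zero entry.
        """
        total = counts.sum()
        p_wf = counts / total                      # joint probabilities P(w, f)
        p_w = p_wf.sum(axis=1, keepdims=True)      # marginals P(w)
        p_f = p_wf.sum(axis=0, keepdims=True)      # marginals P(f)
        with np.errstate(divide="ignore"):         # log(0) = -inf, clipped below
            ratio = p_wf / (p_w * p_f)             # the probability ratio measure
            return np.maximum(np.log(ratio), 0.0)  # negative log values sent to 0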

A.2 Formulae for Computing Entailment

APinc is the average precision applied to feature inclusion. It measures a ranked version of the inclusion of the features of \(\overrightarrow{u}\) in those of \(\overrightarrow{v}\), with features ranked from highest to lowest value:

$$\begin{aligned} \textit{APinc}(u,v) = \frac{\sum _r \left[ P(r) \cdot \textit{rel}'(f_r)\right] }{|F(\overrightarrow{u})|} \end{aligned}$$
(1)

In the above, \(f_r\) is the feature with rank r in the feature set of \(\overrightarrow{u}\), denoted by \(F(\overrightarrow{u})\); P(r) is the precision at rank r, which measures how many of \(\overrightarrow{u}\)’s features up to rank r are included in the features of \(\overrightarrow{v}\); and \(\textit{rel}'(f_r)\) is a relevance measure reflecting how important \(f_r\) is in \(\overrightarrow{v}\). It is computed as follows:

$$\begin{aligned} \textit{rel}'(f) = \begin{cases} 1-\frac{\textit{rank}(f,F(\overrightarrow{v}))}{|F(\overrightarrow{v})|+1} &amp; f \in F(\overrightarrow{v}) \\ 0 &amp; \text {otherwise} \end{cases} \end{aligned}$$
(2)
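
The two equations translate directly into a short Python sketch, assuming dense numpy vectors whose positive entries are the features and breaking ranking ties arbitrarily; this is a reading of Eqs. (1)–(2), not the code of [14].

    import numpy as np

    def apinc(u, v):
        """APinc(u, v): ranked inclusion of u's features in v's (Eqs. 1-2)."""
        # F(u), F(v): indices of non-zero features, ranked highest value first.
        Fu = [i for i in np.argsort(-u) if u[i] > 0]
        Fv = [i for i in np.argsort(-v) if v[i] > 0]
        rank_in_v = {f: r for r, f in enumerate(Fv, start=1)}

        total, included = 0.0, 0
        for r, f in enumerate(Fu, start=1):
            if f in rank_in_v:
                included += 1                           # f is included in F(v)
                rel = 1 - rank_in_v[f] / (len(Fv) + 1)  # Eq. (2)
            else:
                rel = 0.0
            total += (included / r) * rel               # P(r) * rel'(f_r), Eq. (1)
        return total / len(Fu) if Fu else 0.0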

BAPinc balances APinc with the LIN degree of similarity between the vectors. It was developed in [14] after it was observed that APinc returns poor results when the vectors have radically different numbers of non-zero features; the LIN measure was included to balance out the extra dimensions of the longer vector.

$$\begin{aligned} \textit{BAPinc}(u,v) = \sqrt{\textit{LIN}(u,v) \cdot \textit{APinc}(u,v)} \end{aligned}$$
(3)

LIN is a similarity measure between vectors and was defined in [22]. It can be replaced with any other similarity measure, such as the cosine measure.
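
A sketch of the combination, reusing apinc from the previous block and one common formulation of the LIN measure on weighted feature vectors (the exact weighting in [22] may differ):

    import numpy as np

    def lin(u, v):
        """A common form of LIN: weight of shared features over all weights."""
        shared = (u > 0) & (v > 0)
        denom = u[u > 0].sum() + v[v > 0].sum()
        return (u[shared] + v[shared]).sum() / denom if denom > 0 else 0.0

    def bapinc(u, v):
        """BAPinc(u, v): geometric mean of LIN and APinc (Eq. 3)."""
        return np.sqrt(lin(u, v) * apinc(u, v))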

SAPinc is a measure developed in [12], based on BAPinc, but for dense vectors. Whereas APinc and BAPinc were developed to compute the degree of entailment between word vectors, which are usually sparse since word vectors live in high-dimensional spaces (e.g. 5000 dimensions), SAPinc was developed to deal with phrase and sentence vectors. These are obtained by composing the vectors of words in lower-dimensional spaces (e.g. 300 dimensions), where the compositional operators accumulate information and return dense results.

$$\begin{aligned} \textit{SAPinc}(u,v) = \frac{\sum _r \left[ P(r) \cdot \textit{rel}'(f_r)\right] }{|\overrightarrow{u}|} \end{aligned}$$
(4)

Here, P(r) and \(rel'(f_r)\) are defined differently, as shown below:

$$\begin{aligned} P(r) = \frac{\big |\{\, f_{r'}^{(u)} \mid f_{r'}^{(u)} \le f_{r'}^{(v)},\ 0 &lt; r' \le r \,\}\big |}{r} \end{aligned}$$
(5)
$$\begin{aligned} \textit{rel}'(f_r) = \begin{cases} 1 &amp; f_r^{(u)} \le f_r^{(v)} \\ 0 &amp; \text {otherwise} \end{cases} \end{aligned}$$
(6)

For more details on these measures, see [12, 13].
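
Under the reading of Eqs. (4)–(6) above, in which \(f_r^{(u)}\) and \(f_r^{(v)}\) are the r-th components of the two dense vectors under a common ordering of the coordinates, SAPinc can be sketched as follows (see [12] for the authoritative definition):

    def sapinc(u, v):
        """SAPinc(u, v) for dense vectors of equal dimension (Eqs. 4-6)."""
        n = len(u)
        total, included = 0.0, 0
        for r in range(1, n + 1):
            if u[r - 1] <= v[r - 1]:       # the r-th feature of u is included in v
                included += 1
                rel = 1.0                  # Eq. (6)
            else:
                rel = 0.0
            total += (included / r) * rel  # P(r) * rel'(f_r), Eqs. (4)-(5)
        return total / n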

A.3 Experimental Results for a Second Sample

The results of the experiment of Sect. 6, with the PPMI and probability ratio matrices on the second sample of 1000 entries from the dataset, are presented in Fig. 7.

Fig. 7. Results of the non-commutative conjunction experiment with PPMI and the probability ratio on the second sample of the dataset.

Similar to the results presented in the paper, the non-commutative operation performs better at recognising the non-commutative conjunctive entailments.


Copyright information

© 2017 Springer-Verlag GmbH Germany

About this paper

Cite this paper

Cvetko-Vah, K., Sadrzadeh, M., Kartsaklis, D., Blundell, B. (2017). Non-commutative Logic for Compositional Distributional Semantics. In: Kennedy, J., de Queiroz, R. (eds.) Logic, Language, Information, and Computation. WoLLIC 2017. Lecture Notes in Computer Science, vol. 10388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-55386-2_8

  • DOI: https://doi.org/10.1007/978-3-662-55386-2_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-55385-5

  • Online ISBN: 978-3-662-55386-2
