
Syntactic Dependency-Based Feature Selection

Chapter in: The Naïve Bayes Model for Unsupervised Word Sense Disambiguation

Part of the book series: SpringerBriefs in Statistics (BRIEFSSTATIST)

Abstract

The feature selection method we present in this chapter makes use of syntactic knowledge provided by dependency relations. Dependency-based feature selection for the Naïve Bayes model is examined and exemplified in the case of adjectives. Performing this type of knowledge-based feature selection places the disambiguation process at the border between unsupervised and knowledge-based techniques. The discussed type of feature selection and the corresponding disambiguation method will once again prove that a basic, knowledge-lean disambiguation algorithm, here represented by the Naïve Bayes model, can perform quite well when provided with knowledge in an appropriate way. Our main conclusion will be that the Naïve Bayes model reacts well in the presence of syntactic knowledge of this type and that dependency-based feature selection for the Naïve Bayes model is a reliable alternative to WordNet-based semantic feature selection.
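
The abstract refers to the Naïve Bayes model of Chap. 2, which scores candidate senses from a bag of selected contextual features. As a rough illustration only (the function names, probability tables, and smoothing floor below are assumptions of this sketch, not the book's implementation, in which the parameters are estimated by EM in the unsupervised setting):

```python
import math

def score_sense(prior, cond, features):
    """Log-score of one sense: log P(s) + sum over features f of log P(f | s)."""
    score = math.log(prior)
    for f in features:
        # Tiny probability floor for features unseen with this sense
        # (an assumption of this sketch, not the book's estimation scheme).
        score += math.log(cond.get(f, 1e-9))
    return score

def best_sense(priors, cond_probs, features):
    """Pick the sense with the highest Naive Bayes log-score.

    priors     -- {sense: P(sense)}
    cond_probs -- {sense: {feature: P(feature | sense)}}
    features   -- contextual features selected for the target occurrence
    """
    return max(priors, key=lambda s: score_sense(priors[s], cond_probs[s], features))
```

Dependency-based feature selection only changes which words enter `features`; the scoring itself is unchanged.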


Notes

  1.

    Dependency grammar (DG) is a class of syntactic theories developed by Lucien Tesnière (1959). Within this theory, syntactic structure is determined by the grammatical relations existing between a word (a head) and its dependents.

  2.

    See the mathematical model presented in Chap. 2.

  3.

    The relations between the dependent and the head are usually represented by an arc.

  4.

    See also Link grammar (Sleator and Temperley 1991, 1993) and Word grammar (Hudson 1984).

  5.

    For which see Sect. 4.3.1.

  6.

    See the mathematical model presented in Chap. 2.

  7.

    They have eliminated only those relations provided by the Stanford parser that are potentially not useful for WSD, such as determiner, predeterminer, numeric determiner, and punctuation relations.

  8.

    In what follows, such dependencies will be called first order dependencies.

  9.

    A path anchored at the target word w is a path in the dependency graph starting at w. If the dependency relations are directed, leading to an associated directed graph, a path anchored at w is a path that either starts or ends at w.

  10.

    In what follows, such dependencies will be called second order dependencies.

  11.

    These are the same as those showing the distribution of the senses of common and public, respectively, in Chap. 3.

  12.

    Taken from Hristea and Colhon (2012).

  13.

    Taken from Hristea and Colhon (2012).

  14.

    The considered directionality is from head to dependent.

  15.

    In what follows, these dependencies will be called head-driven dependencies.

  16.

    In what follows, these dependencies will be called dependent-driven dependencies.

  17.

    The case Two head-driven dependencies can be summarized as follows: let us denote the target word by \(A\); collect all words of type \(B\) and \(C\) such that \(B\) is a dependent of \(A\) and \(C\) is a dependent of \(B\).

  18.

    The case Head-driven dependencies and dependent-driven dependencies can be summarized as follows: let us denote the target word by \(A\); collect all words of type \(B\) and \(C\) such that \(B\) is a dependent of \(A\) and \(B\) is a dependent of \(C\).

  19.

    The case Two dependent-driven dependencies can be summarized as follows: let us denote the target word by \(A\); collect all words of type \(B\) and \(C\) such that \(A\) is a dependent of \(B\) and \(B\) is a dependent of \(C\).

  20.

    The case Dependent-driven dependencies and head-driven dependencies can be summarized as follows: let us denote the target word by \(A\); collect all words of type \(B\) and \(C\) such that \(A\) is a dependent of \(B\) and \(C\) is a dependent of \(B\).

  21.

    This principle gives the nominal information priority, while the adjectival information is evaluated strictly within the range allowed by the nominal one. It guided Hristea and Colhon (2012) when choosing the nominal subject relation, for instance. This relation refers to the predicative form of the adjective, linked via a copula verb to the noun that the adjective modifies.

  22.

    See the mathematical model presented in Chap. 2.

  23.

    Taken from Hristea and Colhon (2012).

  24.

    Taken from Hristea and Colhon (2012).

  25.

    This is the approach suggested by the first series of experiments performed, which disregarded the dependency type. Test results have shown (see Sect. 4.3.2) that the directionality of the relations matters and that the best disambiguation results are obtained when the target word plays the role of head.

  26.

    For more details concerning how to define accuracy in the case of unsupervised disambiguation, see Sect. 3.4.2 of Chap. 3.

  27.

    Taken from Hristea and Colhon (2012).

  28.

    See the mathematical model presented in Chap. 2.

  29.

    Let us note that accuracy is always higher in the case Two head-driven dependencies than in the case Head-driven dependencies and dependent-driven dependencies, which shows that, in the case of directed first and second order dependencies, it is essential to consider the head role not only of the target word but also of its dependents.

  30.

    See the mathematical model presented in Chap. 2.

  31.

    Disambiguation results are close but slightly inferior in the case of head-driven first and second order dependencies (see Tables 4.5 and 4.6).

  32.

    This allows the arcs denoting the dependency relations to cross.

  33.

    This does not allow the arcs denoting the dependency relations to cross, in accordance with classical dependency linguistic theory.

  34.

    See the mathematical model presented in Chap. 2.

  35.

    In the case of the adjective public; see Table 4.8 of Sect. 4.3.2.1.
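
Notes 17 to 20 describe four directed second-order dependency patterns around a target word \(A\), with edge direction taken from head to dependent (Note 14). The following sketch, a hypothetical illustration rather than the authors' implementation (the function name, data layout, and the exclusion of the target itself from the \(C\) position are all assumptions), collects the corresponding (B, C) pairs from a list of head-to-dependent edges:

```python
def second_order_features(edges, target):
    """Collect (B, C) pairs for the four directed patterns around target A.

    edges  -- iterable of (head, dependent) pairs, direction head -> dependent
    target -- the target word A
    """
    heads_of, dependents_of = {}, {}
    for head, dep in edges:
        dependents_of.setdefault(head, set()).add(dep)
        heads_of.setdefault(dep, set()).add(head)

    deps = dependents_of.get(target, set())   # words B with A -> B
    heads = heads_of.get(target, set())       # words B with B -> A

    return {
        # Note 17: B is a dependent of A, C is a dependent of B (A -> B -> C)
        "two_head_driven": {(b, c) for b in deps
                            for c in dependents_of.get(b, ())},
        # Note 18: B is a dependent of A, B is a dependent of C (A -> B <- C)
        "head_then_dependent_driven": {(b, c) for b in deps
                                       for c in heads_of.get(b, ())
                                       if c != target},
        # Note 19: A is a dependent of B, B is a dependent of C (C -> B -> A)
        "two_dependent_driven": {(b, c) for b in heads
                                 for c in heads_of.get(b, ())},
        # Note 20: A is a dependent of B, C is a dependent of B (A <- B -> C)
        "dependent_then_head_driven": {(b, c) for b in heads
                                       for c in dependents_of.get(b, ())
                                       if c != target},
    }
```

Per Note 29, the first pattern (where the target and its dependents both act as heads) gave the highest accuracy in the reported experiments; this sketch merely makes the four collection rules concrete.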

References

  • Bruce, R., Wiebe, J., Pedersen, T.: The Measure of a Model, CoRR, cmp-lg/9604018 (1996)

  • Chen, P., Bowes, C., Ding, W., Brown, D.: A fully unsupervised word sense disambiguation method using dependency knowledge. In: Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the ACL, pp. 28–36 (2009)

  • de Marneffe, M.C., Manning, C.D.: Stanford typed dependencies manual. Technical Report, Stanford University (2008)

  • de Marneffe, M.C., MacCartney, B., Manning, C.D.: Generating typed dependency parses from phrase structure parses. In: Proceedings of LREC-06, pp. 449–454 (2006)

  • Grefenstette, G.: Explorations in Automatic Thesaurus Discovery. Kluwer Academic Publishers, Dordrecht (1994)

  • Hristea, F.: Recent advances concerning the usage of the Naïve Bayes model in unsupervised word sense disambiguation. Int. Rev. Comput. Softw. 4(1), 58–67 (2009)

  • Hristea, F., Colhon, M.: Feeding syntactic versus semantic knowledge to a knowledge-lean unsupervised word sense disambiguation algorithm with an underlying Naïve Bayes model. Fundam. Inform. 119(1), 61–86 (2012)

  • Hristea, F., Popescu, M.: Adjective sense disambiguation at the border between unsupervised and knowledge-based techniques. Fundam. Inform. 91(3–4), 547–562 (2009)

  • Hristea, F., Popescu, M., Dumitrescu, M.: Performing word sense disambiguation at the border between unsupervised and knowledge-based techniques. Artif. Intell. Rev. 30(1), 67–86 (2008)

  • Hudson, R.A.: Word Grammar. Blackwell, Oxford (1984)

  • Klein, D., Manning, C.D.: Accurate unlexicalized parsing. In: Proceedings of the 41st Meeting of the Association for Computational Linguistics (ACL 2003), pp. 423–430 (2003)

  • Lee, L.: Measures of distributional similarity. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pp. 25–32 (1999)

  • Levin, B.: English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, Chicago (1993)

  • Lin, D.: Automatic retrieval and clustering of similar words. In: Proceedings of the Joint Annual Meeting of the Association for Computational Linguistics and International Conference on Computational Linguistics, pp. 768–774 (1998)

  • Năstase, V.: Unsupervised all-words word sense disambiguation with grammatical dependencies. In: Proceedings of the Third International Joint Conference on Natural Language Processing, pp. 757–762 (2008)

  • Padó, S., Lapata, M.: Dependency-based construction of semantic space models. Comput. Linguist. 33(2), 161–199 (2007)

  • Ponzetto, S.P., Navigli, R.: Knowledge-rich word sense disambiguation rivaling supervised systems. In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010), Uppsala, Sweden, ACL Press, pp. 1522–1531 (2010)

  • Sleator, D., Temperley, D.: Parsing English with a link grammar. Technical Report CMU-CS-91-196, Carnegie Mellon University, Pittsburgh, PA (1991)

  • Sleator, D., Temperley, D.: Parsing English with a link grammar. In: Proceedings of the Third International Workshop on Parsing Technologies (IWPT93), pp. 277–292 (1993)

  • Tesnière, L.: Eléments de syntaxe structurale. Klincksieck, Paris (1959)


Author information

Correspondence to Florentina T. Hristea.


Copyright information

© 2013 The Author(s)

Cite this chapter

Hristea, F.T. (2013). Syntactic Dependency-Based Feature Selection. In: The Naïve Bayes Model for Unsupervised Word Sense Disambiguation. SpringerBriefs in Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33693-5_4

