Improved Chinese Word Segmentation Disambiguation Model Based on Conditional Random Fields

  • Conference paper
Proceedings of the 4th International Conference on Computer Engineering and Networks

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 355)

Abstract

This paper proposes an improved model, based on conditional random fields (CRFs), for resolving ambiguity in Chinese word segmentation. The model first segments the text with a bidirectional maximum matching algorithm and extracts the ambiguous parts. It then resolves those ambiguities with a CRF and outputs a more accurate segmentation. Test results show that the model reduces the segmentation error rate caused by segmentation ambiguity.
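The first stage described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the dictionary, the maximum word length, or how the ambiguous part is delimited. The function names, the toy vocabulary, the `max_len` parameter, and the choice to treat any region where the forward and backward segmentations disagree as "ambiguous" are all assumptions made for illustration. The CRF disambiguation stage is not shown.

```python
def forward_max_match(text, vocab, max_len):
    """Greedy left-to-right pass: take the longest dictionary word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len):
    """Greedy right-to-left pass: take the longest dictionary word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def boundaries(words):
    """Character offsets at which a word ends."""
    cuts, pos = set(), 0
    for w in words:
        pos += len(w)
        cuts.add(pos)
    return cuts


def ambiguous_spans(text, vocab, max_len=4):
    """Return both segmentations and the substrings on which they disagree."""
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)
    f_cuts, b_cuts = boundaries(fwd), boundaries(bwd)
    spans, start = [], 0
    # Walk over the boundaries both passes agree on; a region between two
    # shared boundaries is ambiguous if the passes cut it differently inside.
    for end in sorted(f_cuts & b_cuts):
        inner_f = {c for c in f_cuts if start < c < end}
        inner_b = {c for c in b_cuts if start < c < end}
        if inner_f != inner_b:
            spans.append(text[start:end])
        start = end
    return fwd, bwd, spans


# Toy dictionary (hypothetical) and a classic overlapping-ambiguity example.
toy_vocab = {"研究", "研究生", "生命", "起源"}
fwd, bwd, spans = ambiguous_spans("研究生命起源", toy_vocab)
print(fwd, bwd, spans)
```

In this sketch, only the spans returned as ambiguous would be handed to a second-stage classifier such as a CRF; the regions where both passes agree are kept as they are.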

Author information

Corresponding author

Correspondence to Fanjin Mai.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Mai, F., Wu, S., Cui, T. (2015). Improved Chinese Word Segmentation Disambiguation Model Based on Conditional Random Fields. In: Wong, W. (eds) Proceedings of the 4th International Conference on Computer Engineering and Networks. Lecture Notes in Electrical Engineering, vol 355. Springer, Cham. https://doi.org/10.1007/978-3-319-11104-9_70

  • DOI: https://doi.org/10.1007/978-3-319-11104-9_70

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11103-2

  • Online ISBN: 978-3-319-11104-9

  • eBook Packages: Engineering, Engineering (R0)
