Abstract
This paper proposes an improved model, based on conditional random fields (CRFs), for resolving ambiguity in Chinese word segmentation. The model first segments the text with a bidirectional maximum matching algorithm and extracts the ambiguous parts. It then resolves those ambiguities with a CRF-based disambiguation step and outputs a more accurate segmentation. Test results show that the model reduces the segmentation error rate caused by word segmentation ambiguity.
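The first stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the `max_len` parameter, and all function names are assumptions made for illustration. The sketch runs forward and backward maximum matching over the same text and reports the stretches between agreed word boundaries where the two segmentations differ. The CRF stage is not shown; in the model described, such spans would be passed on to it for disambiguation.

```python
from typing import List, Set


def forward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[str]:
    """Greedy left-to-right matching: take the longest lexicon word at each position."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[str]:
    """Greedy right-to-left matching: take the longest lexicon word ending at each position."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


def boundaries(words: List[str]) -> Set[int]:
    """Character offsets at which each word ends."""
    cuts, pos = set(), 0
    for word in words:
        pos += len(word)
        cuts.add(pos)
    return cuts


def ambiguous_spans(text: str, fwd: List[str], bwd: List[str]) -> List[str]:
    """Substrings between consecutive agreed boundaries where the segmentations disagree."""
    fcuts, bcuts = boundaries(fwd), boundaries(bwd)
    agreed = sorted((fcuts & bcuts) | {0})
    spans = []
    for start, end in zip(agreed, agreed[1:]):
        inner_f = {c for c in fcuts if start < c < end}
        inner_b = {c for c in bcuts if start < c < end}
        if inner_f != inner_b:
            spans.append(text[start:end])
    return spans


# Hypothetical toy lexicon, chosen only to produce a crossing ambiguity.
LEXICON = {"研究", "研究生", "生命", "起源"}
TEXT = "研究生命起源"

fwd = forward_max_match(TEXT, LEXICON, max_len=3)
bwd = backward_max_match(TEXT, LEXICON, max_len=3)
print(fwd, bwd, ambiguous_spans(TEXT, fwd, bwd))
```

With this toy lexicon the two passes agree on 起源 but split the first four characters differently, so only 研究生命 is flagged as ambiguous and would need further disambiguation.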
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Mai, F., Wu, S., Cui, T. (2015). Improved Chinese Word Segmentation Disambiguation Model Based on Conditional Random Fields. In: Wong, W. (eds) Proceedings of the 4th International Conference on Computer Engineering and Networks. Lecture Notes in Electrical Engineering, vol 355. Springer, Cham. https://doi.org/10.1007/978-3-319-11104-9_70
DOI: https://doi.org/10.1007/978-3-319-11104-9_70
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11103-2
Online ISBN: 978-3-319-11104-9
eBook Packages: Engineering, Engineering (R0)