
Text Difficulty Study: Do Machines Behave the Same as Humans Regarding Text Difficulty?

  • Research Article
  • Published in: Machine Intelligence Research

Abstract

With the emergence of pre-trained models, current neural networks are able to achieve task performance comparable to that of humans. However, we know little about the fundamental working mechanism of pre-trained models, i.e., how they reach such performance and how they solve the task. For example, given a task, humans learn from easy to hard, whereas a model learns in a random order. Undeniably, difficulty-insensitive learning has led to great success in natural language processing (NLP), but little attention has been paid to the effect of text difficulty in NLP. We propose a human learning matching index (HLM Index) to investigate the effect of text difficulty. Experimental results show that: 1) LSTM exhibits more human-like learning behavior than BERT; moreover, UID-SuperLinear gives the best evaluation of text difficulty among four text difficulty criteria, and among nine tasks, performance on some tasks is related to text difficulty whereas on others it is not. 2) A model trained on easy data performs best on both easy and medium test data, whereas a model trained on hard data performs well only on hard test data. 3) Training the model from easy to hard leads to quicker convergence.
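
The findings above suggest a simple recipe one could experiment with: score each training sentence's difficulty with a UID-style superlinear surprisal aggregate, then present examples from easy to hard. Below is a minimal sketch of that idea in Python; the exponent k = 1.25, the choice of GPT-2 as the surprisal estimator, and the helper name uid_superlinear are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch: UID-superlinear difficulty scoring plus easy-to-hard ordering.
# Assumptions: GPT-2 as the surprisal model and k = 1.25 are illustrative
# choices, not necessarily the configuration used in the paper.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def uid_superlinear(sentence: str, k: float = 1.25) -> float:
    """Mean per-token surprisal in nats, each raised to the power k (> 1)."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids   # [1, T]
    with torch.no_grad():
        logits = model(ids).logits                             # [1, T, vocab]
    # Position t predicts token t+1, so align logits[:-1] with ids[1:].
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    surprisals = -log_probs.gather(1, ids[0, 1:].unsqueeze(1)).squeeze(1)
    return (surprisals ** k).mean().item()

corpus = [
    "The cat sat on the mat.",
    "Notwithstanding the antecedent adjudication, the claim subsists.",
]
# Easy-to-hard curriculum: feed low-difficulty sentences to the model first.
curriculum = sorted(corpus, key=uid_superlinear)
for sent in curriculum:
    print(f"{uid_superlinear(sent):6.2f}  {sent}")
```

Under this ordering, finding 3) above would predict faster convergence than training on a random shuffle of the same corpus.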



Acknowledgements

We would like to gratefully acknowledge the support of the National Natural Science Foundation of China (Nos. U22B2059 and 62176079), the Natural Science Foundation of Heilongjiang Province, China (No. YQ2022F005), and the Industry-University-Research Innovation Foundation of China University (2021ITA05009).

Author information

Corresponding author

Correspondence to Xiao Ding.

Ethics declarations

The authors declare that they have no conflicts of interest related to this work.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Bowen Chen received the M. Sc. degree in cyberspace security from Harbin Institute of Technology, China in 2022, where he studied at the Research Center for Social Computing and Information Retrieval. He is currently a Ph. D. candidate at the Graduate School of Information Science and Technology, University of Tokyo, Japan.

His research interests include cognition-inspired natural language processing, temporal commonsense inference, and dialogue systems.

Xiao Ding received the Ph. D. degree in computer science from the School of Computer Science and Technology, Harbin Institute of Technology, China in 2016, where he is currently a professor.

His research interests include natural language processing, text mining, social computing, and common-sense inference.

Yi Zhao is a master's student at Harbin Institute of Technology, China, where he is studying at the Research Center for Social Computing and Information Retrieval.

His research interest is cognition-inspired natural language processing.

Bo Fu received the Ph. D. degree in computer science from Harbin Institute of Technology, China in 2015. She is an NLP algorithm engineer at the Foundation Technology Center of CCB Fintech Co., Ltd., China.

Her research interests include pre-trained language models and dialogue systems.

Tingmao Lin received the B. Sc. degree in computer science from Peking University, China in 2009. He is currently a machine learning engineer at the Foundation Technology Center of CCB Fintech Co., Ltd., China.

His research interests include stock market prediction, natural language processing and representation learning.

Bing Qin received the Ph. D. degree in computer science from the Department of Computer Science, Harbin Institute of Technology, China in 2005. She is currently a full professor in the Department of Computer Science and the director of the Research Center for Social Computing and Information Retrieval (HIT-SCIR), Harbin Institute of Technology, China.

Her research interests include natural language processing, information extraction, document-level discourse analysis, and sentiment analysis.

Ting Liu received the Ph. D. degree in computer science from the Department of Computer Science, Harbin Institute of Technology, China in 1998. He is currently a full professor in the Department of Computer Science, and the director of the Research Center for Social Computing and Information Retrieval (HIT-SCIR), Harbin Institute of Technology, China.

His research interests include information retrieval, natural language processing, and social media analysis.


About this article

Cite this article

Chen, B., Ding, X., Zhao, Y. et al. Text Difficulty Study: Do Machines Behave the Same as Humans Regarding Text Difficulty? Mach. Intell. Res. 21, 283–293 (2024). https://doi.org/10.1007/s11633-023-1424-x

