DOI: 10.1145/3573051.3596170
short-paper
Open Access

Towards Scalable Vocabulary Acquisition Assessment with BERT

Published: 20 July 2023

ABSTRACT

In this investigation, we propose new machine learning methods for automated scoring models that predict the vocabulary acquisition in science and social studies of second-grade English language learners, based on free-form spoken responses. We evaluate performance on an existing dataset and use transfer learning from a large pre-trained language model, reporting the influence of various objective function designs and the input-convex network design. In particular, we find that combining objective functions with varying properties, such as distance among scores, greatly improves the model's reliability as measured against human raters. Our models extend the current state-of-the-art performance for assessing word definition tasks and sentence usage tasks in science and social studies, achieving excellent quadratic weighted kappa scores compared with human raters. However, human-human agreement still surpasses model-human agreement, leaving room for future improvement. Even so, our work highlights the scalability of automated vocabulary assessment of free-form spoken language tasks in early grades.
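For readers unfamiliar with the agreement metric named in the abstract, the following is a minimal sketch of the standard quadratic weighted kappa computation between two raters. It is not the authors' evaluation code: the function name, the assumption that scores are integers in the range 0 to num_classes - 1, and the example scores are all illustrative choices.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    rater_a: Sequence[int], rater_b: Sequence[int], num_classes: int
) -> float:
    """Quadratic weighted kappa for two raters scoring the same items.

    Scores are assumed (for this sketch) to be integers in 0..num_classes-1.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(rater_a)

    # Observed matrix: observed[i][j] counts items scored i by A and j by B.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal score histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [
        sum(observed[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty; the usual 1/(k-1)^2 factor cancels in the ratio.
            weight = (i - j) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters scored independently.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0.0:
        # Both raters gave every item the same single score; kappa is
        # formally undefined, and this sketch returns 1.0 by convention.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    human = [0, 1, 2, 1]   # hypothetical human scores
    model = [0, 1, 2, 1]   # hypothetical model scores
    print(quadratic_weighted_kappa(human, model, num_classes=3))
```

Because the penalty grows with the squared distance between the two scores, a disagreement of two score points costs four times as much as a disagreement of one, which is why the metric suits ordinal scoring rubrics.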

