Towards Scalable Vocabulary Acquisition Assessment with BERT

Authors:
Zhongdi Wu, Southern Methodist University, Dallas, TX, USA (ORCID: 0000-0001-5660-0038)
Eric Larson, Southern Methodist University, Dallas, TX, USA (ORCID: 0000-0001-6040-868X)
Makoto Sano, Mack-the-Psych.com, Tokyo, Japan (ORCID: 0000-0001-9870-1291)
Doris Baker, University of Texas at Austin, Austin, TX, USA (ORCID: 0000-0001-8517-9799)
Nathan Gage, Southern Methodist University, Dallas, TX, USA (ORCID: 0000-0003-2837-3653)
Akihito Kamata, Southern Methodist University, Dallas, TX, USA (ORCID: 0000-0001-9570-1464)

L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale, July 2023, Pages 272–276. https://doi.org/10.1145/3573051.3596170

Published: 20 July 2023

ABSTRACT

We propose new machine learning methods for automated scoring models that predict vocabulary acquisition in science and social studies among second-grade English language learners, based on free-form spoken responses. We evaluate performance on an existing dataset and use transfer learning from a large pre-trained language model, reporting the influence of various objective function designs and of the input-convex network design. In particular, we find that combining objective functions with varying properties, such as sensitivity to the distance among scores, greatly improves the reliability of the model's scores when measured against human raters. Our models extend the current state-of-the-art performance for assessing word definition tasks and sentence usage tasks in science and social studies, achieving excellent quadratic weighted kappa scores when compared with human raters. However, human-human agreement still surpasses model-human agreement, leaving room for future improvement. Even so, our work highlights the scalability of automated vocabulary assessment of free-form spoken language tasks in early grades.
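
The quadratic weighted kappa named in the abstract is a standard chance-corrected agreement statistic for ordinal scores: disagreements are penalized by the squared distance between the two scores, so a model that is off by two score points is penalized four times as much as one that is off by one. The following is a minimal sketch of that statistic, not the authors' evaluation code. The function name, the 0-based integer score encoding, the convention for the degenerate case, and the example scores are all assumptions made for illustration.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Cohen's kappa with quadratic weights.

    Assumes scores are integers in [0, num_classes); this encoding is an
    illustrative assumption, not something stated in the paper.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("need at least two score categories")

    n = len(rater_a)
    k = num_classes

    # Observed joint counts: observed[i][j] = number of responses scored
    # i by rater A and j by rater B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1

    # Marginal score histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic disagreement weight, scaled to [0, 1].
            weight = (i - j) ** 2 / (k - 1) ** 2
            # Count expected if the two raters scored independently.
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters gave the same single score to every response; kappa
        # is undefined here, and returning 1.0 is a convention chosen for
        # this sketch.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical scores on a 0-2 scale for five responses.
human = [0, 1, 2, 2, 1]
model = [0, 1, 2, 1, 1]
print(quadratic_weighted_kappa(human, model, num_classes=3))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. Computing the same statistic for a human-human pair and for a model-human pair gives the kind of comparison the abstract describes.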


Index Terms

1. Applied computing
  1. Education
    1. Computer-managed instruction
2. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks


Published in
L@S '23: Proceedings of the Tenth ACM Conference on Learning @ Scale
July 2023
445 pages
ISBN: 9798400700255
DOI: 10.1145/3573051
General Chair: Daniel Spikol (University of Copenhagen, Denmark)
Program Chairs: Olga Viberg (KTH Royal Institute of Technology, Sweden), Alejandra Martínez-Monés (Universidad de Valladolid, Spain), Philip Guo (UC San Diego, USA)
Copyright © 2023 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
automated scoring
deep neural networks
human-machine reliability
natural language processing
transfer learning
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 117 of 440 submissions, 27%