
Towards Generalized Methods for Automatic Question Generation in Educational Domains

  • Conference paper
  • Published in: Educating for a New Future: Making Sense of Technology-Enhanced Learning Adoption (EC-TEL 2022)

Abstract

Students learn more from doing activities and practicing their skills on assessments, yet generating such practice opportunities can be challenging and time-consuming. In our work, we examine how advances in natural language processing and question generation may help address this issue. In particular, we present a pipeline for generating and evaluating questions from text-based learning materials in an introductory data science course. The pipeline applies a text-to-text transformer (T5) question generation model and a concept hierarchy extraction model to the text content, then scores the generated questions based on their relevance to the extracted key concepts. We further evaluated question quality with three different approaches: an information score, automated rating by a fine-tuned language model (OpenAI's GPT-3), and manual review by human instructors. Our results showed that the generated questions were rated favorably by all three evaluation methods. We conclude with a discussion of the strengths and weaknesses of the generated questions and outline next steps towards refining the pipeline and promoting natural language processing research in educational domains.
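To make the described pipeline concrete, below is a minimal, illustrative sketch (not the authors' code) of two core steps: generating candidate questions from course text with a T5 sequence-to-sequence model and scoring each question by its overlap with extracted key concepts. The checkpoint name, the prompt prefix, and the simple overlap score are assumptions for illustration; the paper's actual models and relevance scoring differ in detail.

```python
# Illustrative sketch only: T5-based question generation plus a toy
# concept-relevance score. Checkpoint name and prompt prefix are placeholders,
# not the fine-tuned model used in the paper.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "t5-base"  # placeholder; the paper uses a T5 question-generation model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def generate_questions(passage: str, num_questions: int = 3) -> list[str]:
    """Generate candidate questions from a passage of learning material."""
    inputs = tokenizer("generate questions: " + passage,  # assumed prompt format
                       return_tensors="pt", truncation=True, max_length=512)
    outputs = model.generate(**inputs,
                             num_beams=num_questions,
                             num_return_sequences=num_questions,
                             max_length=64)
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

def concept_relevance(question: str, concepts: list[str]) -> float:
    """Toy relevance score: fraction of key concepts mentioned in the question."""
    q = question.lower()
    hits = sum(1 for c in concepts if c.lower() in q)
    return hits / max(len(concepts), 1)

if __name__ == "__main__":
    passage = ("A confidence interval gives a range of plausible values "
               "for a population parameter based on a sample statistic.")
    concepts = ["confidence interval", "population parameter", "sample"]
    for q in generate_questions(passage):
        print(f"{concept_relevance(q, concepts):.2f}  {q}")
```

In the paper's full pipeline, the generated questions are additionally rated with an information score, a fine-tuned GPT-3 classifier, and human instructors; the snippet above covers only generation and a stand-in for concept-relevance scoring.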


Notes

  1. https://www.crummy.com/software/BeautifulSoup/bs4/doc/ (a minimal extraction sketch using this library follows these notes).

  2. We used the hyperparameter set suggested in https://beta.openai.com/docs/guides/fine-tuning.

  3. With our question generation routine (Fig. 1), the text content in each Topic was used as input three times, which could lead to duplicate questions, even if the accompanying header names were different.

  4. https://github.com/MCDS-Foundations/data-science-question-generation.
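As a companion to note 1, the following is a small, illustrative sketch of how plain text might be pulled out of HTML course pages with BeautifulSoup before question generation. The file name and tag selection are assumptions and do not reflect the actual course export format used in the paper.

```python
# Illustrative only: extracting headers and paragraph text from an HTML page
# with BeautifulSoup (see note 1). "topic_page.html" is a placeholder file name.
from bs4 import BeautifulSoup

with open("topic_page.html", encoding="utf-8") as f:
    soup = BeautifulSoup(f, "html.parser")

# Remove script and style elements so only visible learning content remains.
for tag in soup(["script", "style"]):
    tag.decompose()

# Collect header names and paragraph text for downstream question generation.
headers = [h.get_text(strip=True) for h in soup.find_all(["h1", "h2", "h3"])]
paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]

print(headers)
print("\n\n".join(paragraphs))
```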


Author information

Correspondence to Huy A. Nguyen.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Nguyen, H.A., Bhat, S., Moore, S., Bier, N., Stamper, J. (2022). Towards Generalized Methods for Automatic Question Generation in Educational Domains. In: Hilliger, I., Muñoz-Merino, P.J., De Laet, T., Ortega-Arranz, A., Farrell, T. (eds) Educating for a New Future: Making Sense of Technology-Enhanced Learning Adoption. EC-TEL 2022. Lecture Notes in Computer Science, vol 13450. Springer, Cham. https://doi.org/10.1007/978-3-031-16290-9_20


  • DOI: https://doi.org/10.1007/978-3-031-16290-9_20


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16289-3

  • Online ISBN: 978-3-031-16290-9

