Development of an LLM-Based Adaptive Evaluation System in Smart Tutor Using Bloom's Taxonomy

Authors

  • Nicholas Rudy, Program Studi Teknik Informatika, Fakultas Teknologi dan Rekayasa Cerdas, Universitas Kristen Maranatha
  • Hapnes Toba, Program Studi Teknik Informatika, Fakultas Teknologi dan Rekayasa Cerdas, Universitas Kristen Maranatha

DOI:

https://doi.org/10.28932/jste.v2i2.13159

Keywords:

Adaptive learning, BERTScore, Bloom's Taxonomy, LLM evaluation, Smart Tutor

Abstract

This research aims to improve adaptive learning systems by developing a Smart Tutor platform enhanced with Large Language Models (LLMs). Traditional systems that rely on keyword matching or cosine similarity often fail to assess student responses accurately because they lack semantic understanding. To address this, we propose a semantic evaluation system that uses BERTScore and TF-IDF to analyze student answers in context.
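The contrast between lexical and semantic scoring can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how BERTScore and TF-IDF are combined, so the two scores are simply computed side by side. The function names, the example sentences, and the two-dimensional `TOY_EMBEDDINGS` are all hypothetical; real BERTScore uses contextual embeddings from a pretrained BERT model, which the hand-made vectors here only stand in for.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document (smoothed IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dense_cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def greedy_match_f1(candidate_vecs, reference_vecs):
    """BERTScore-style F1: each token is matched to its most similar
    counterpart on the other side, then the maxima are averaged."""
    precision = sum(
        max(dense_cosine(c, r) for r in reference_vecs) for c in candidate_vecs
    ) / len(candidate_vecs)
    recall = sum(
        max(dense_cosine(r, c) for c in candidate_vecs) for r in reference_vecs
    ) / len(reference_vecs)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 2-d embeddings in which synonyms point in similar directions.
TOY_EMBEDDINGS = {
    "car": (1.0, 0.1),
    "automobile": (0.9, 0.2),
    "fast": (0.1, 1.0),
    "quick": (0.2, 0.9),
}

reference_answer = "fast car"
student_answer = "quick automobile"

# Lexical score: TF-IDF cosine similarity sees no shared terms.
vecs = tfidf_vectors([reference_answer, student_answer])
lexical = cosine(vecs[0], vecs[1])

# Semantic score: greedy matching over token embeddings.
semantic = greedy_match_f1(
    [TOY_EMBEDDINGS[t] for t in student_answer.split()],
    [TOY_EMBEDDINGS[t] for t in reference_answer.split()],
)

print(f"TF-IDF cosine: {lexical:.3f}")
print(f"Greedy-match F1: {semantic:.3f}")
```

In this toy case the paraphrased answer shares no words with the reference, so the TF-IDF cosine is zero, while the embedding-based score stays high. That gap is the kind of failure of purely lexical matching that the abstract describes.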

Published

2026-06-30

Section

Articles