Development of an LLM-Based Adaptive Evaluation System in Smart Tutor Using Bloom's Taxonomy
DOI:
https://doi.org/10.28932/jste.v2i2.13159
Keywords:
Adaptive learning, BERTScore, Bloom's Taxonomy, LLM evaluation, Smart Tutor
Abstract
This research aims to improve adaptive learning systems by developing a Smart Tutor platform enhanced with Large Language Models (LLMs). Traditional systems that rely on keyword matching or cosine similarity often fail to accurately assess student responses due to their lack of semantic understanding. To address this, we propose a semantic-based evaluation system using BERTScore and TF-IDF to analyze student answers more contextually.
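The limitation of lexical matching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`), the smoothed IDF formula, and the three example sentences are all assumptions chosen for illustration. The sketch scores a verbatim answer and a paraphrased answer against a reference answer using TF-IDF cosine similarity only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        vectors.append(
            {t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical reference answer and two hypothetical student answers.
reference = "A stack removes the most recently added element first."
verbatim = "A stack removes the most recently added element first."
paraphrase = "The last item pushed is the one that gets popped before any other."

vecs = tfidf_vectors([reference, verbatim, paraphrase])
score_verbatim = cosine(vecs[0], vecs[1])
score_paraphrase = cosine(vecs[0], vecs[2])

print(f"verbatim answer:    {score_verbatim:.3f}")
print(f"paraphrased answer: {score_paraphrase:.3f}")
```

The paraphrase states the same idea as the reference but shares almost no vocabulary with it, so its lexical score stays low. An embedding-based metric such as BERTScore is intended to narrow that gap by comparing contextual token representations instead of surface words.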
Copyright (c) 2026 Nicholas Rudy, Hapnes Toba

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



