Bibliographic record - detail view
Authors | Gani, Mohammed Osman; Ayyasamy, Ramesh Kumar; Sangodiah, Anbuselvan; Fui, Yong Tien |
---|---|
Titel | Bloom's Taxonomy-Based Exam Question Classification: The Outcome of CNN and Optimal Pre-Trained Word Embedding Technique |
Source | In: Education and Information Technologies, 28 (2023) 12, pp. 15893-15914 (22 pages) |
Full-text PDF |
Additional information | ORCID (Gani, Mohammed Osman) ORCID (Ayyasamy, Ramesh Kumar) ORCID (Sangodiah, Anbuselvan) ORCID (Fui, Yong Tien) |
Language | English |
Document type | print; online; journal article |
ISSN | 1360-2357 |
DOI | 10.1007/s10639-023-11842-1 |
Keywords | Models; Artificial Intelligence; Natural Language Processing; Writing (Composition); Accuracy; Classification |
Abstract | The automated classification of examination questions based on Bloom's Taxonomy (BT) aims to assist question setters in producing high-quality question papers. Most studies automating this process adopted a machine learning approach, and only a few utilised deep learning. Pre-trained contextual and non-contextual word embedding techniques have effectively solved various natural language processing tasks. This study aims to identify the optimal pre-trained word embedding technique and to propose a Convolutional Neural Network (CNN) model built on it. Accordingly, the non-contextual embedding techniques Word2vec, GloVe, and FastText and the contextual embedding techniques BERT, RoBERTa, and ELECTRA were analysed on two datasets. The experimental results showed that FastText was the optimal technique on the first dataset, whereas RoBERTa was optimal on the second. The first result differs from the usual text classification finding that contextual embeddings generally outperform non-contextual ones; this could be due to the comparatively small size of the first dataset and the short length of the examination questions. Since RoBERTa was the optimal word embedding technique on the second dataset, it was used along with a CNN to build the model. This study used a CNN instead of Recurrent Neural Networks (RNNs) because, in the context of examination question classification, extracting relevant features is more important than learning sequences from data. The proposed CNN model achieved approximately 86% in both weighted F1-score and accuracy and outperformed all models proposed by past studies, including RNNs. The proposed model's robustness could be assessed in the future using a more comprehensive dataset. (As Provided.) |
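The abstract reports the model's performance as a weighted F1-score. As an illustrative aside (not taken from the paper), the weighted F1-score averages per-class F1 using each class's support as its weight, which suits the imbalanced class distributions typical of BT-labelled question sets. The function name `weighted_f1` and the toy Bloom's-level labels below are assumptions for the sketch:

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Weighted F1: per-class F1 averaged with class support (frequency
    in y_true) as the weights, as in scikit-learn's average='weighted'."""
    classes = sorted(set(y_true))
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += (support[c] / total) * f1
    return score

# Toy example with two Bloom's Taxonomy levels as class labels:
labels = ["Remember", "Remember", "Apply", "Apply"]
preds  = ["Remember", "Apply",    "Apply", "Apply"]
print(round(weighted_f1(labels, preds), 4))
```

Weighting by support means frequent classes dominate the score, so a high weighted F1 can coexist with poor performance on rare BT levels.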
Notes | Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://link.springer.com/ |
Indexed by | ERIC (Education Resources Information Center), Washington, DC |
Update | 2024/01/01 |