Automatic Reflection Level Classification in Hungarian Student Essays

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Natural Language Processing · Depth: Advanced, quick

Summary

A new study presents the first comprehensive investigation of automatic reflection level classification in Hungarian student essays. The researchers used a large, expert-annotated dataset of 1,954 essays labeled on a four-level reflection scale. They compared two approaches: classical machine learning models built on TF-IDF and semantic embedding features, and Hungarian-specific transformer models fine-tuned for document-level classification. Because the dataset is heavily class-imbalanced, they systematically examined several balancing strategies: class weighting, oversampling, data augmentation, and alternative loss functions. An extensive ablation study measured the contribution of each modeling and balancing technique. Shallow machine learning models with effective feature engineering achieved strong overall performance, reaching up to 71% when accuracy, F1-score, and ROC AUC are averaged, while transformer-based models reached 68% but generalized better on minority classes.
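
The headline figures are an average of accuracy, F1-score, and ROC AUC. The summary does not say how F1 and ROC AUC were aggregated over the four classes, so the sketch below assumes macro-averaged F1 and macro-averaged one-vs-rest ROC AUC. The function names and the toy data are hypothetical, and every class is assumed to appear at least once in `y_true`.

```python
def accuracy(y_true, y_pred):
    """Fraction of essays whose predicted level matches the annotated level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 (assumed averaging, not confirmed by the source)."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def binary_auc(is_positive, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, proba, labels):
    """One-vs-rest AUC per class, then an unweighted mean (assumed averaging)."""
    aucs = [
        binary_auc([t == c for t in y_true], [row[k] for row in proba])
        for k, c in enumerate(labels)
    ]
    return sum(aucs) / len(aucs)


def composite_score(y_true, y_pred, proba, labels):
    """Plain mean of the three metrics named in the summary."""
    parts = (
        accuracy(y_true, y_pred),
        macro_f1(y_true, y_pred, labels),
        macro_ovr_auc(y_true, proba, labels),
    )
    return sum(parts) / len(parts)


# Toy, made-up predictions for four essays, one per reflection level.
labels = [0, 1, 2, 3]
y_true = [0, 1, 2, 3]
proba = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
y_pred = [max(range(len(row)), key=row.__getitem__) for row in proba]
score = composite_score(y_true, y_pred, proba, labels)
```

A single averaged number is convenient for ranking models, but it can hide weak minority-class performance, which is why the per-class view matters when comparing the 71% and 68% results.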

Key takeaway

NLP engineers building educational assessment tools for morphologically rich languages should consider starting with classical machine learning models. Transformers generalized better on minority classes in this study, but simpler models with careful feature engineering reached competitive overall performance (up to 71% averaged across accuracy, F1-score, and ROC AUC, versus 68% for transformers), potentially reducing computational overhead and development complexity for initial deployments.
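
As a rough illustration of the kind of feature a classical baseline starts from, here is a minimal TF-IDF sketch in plain Python. The summary does not state which TF-IDF variant or tokenization the study used; this version assumes raw term counts, the smoothed IDF `ln((1 + n) / (1 + df)) + 1`, and L2 normalization, which mirrors the defaults of scikit-learn's `TfidfVectorizer`. The function name and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Return one {term: weight} dict per document, L2-normalized."""
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    # Smoothed IDF: a term present in every document gets IDF 1.0.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}

    vectors = []
    for doc in tokenized_docs:
        counts = Counter(doc)
        raw = {t: c * idf[t] for t, c in counts.items()}
        # Guard against empty documents, whose norm would be zero.
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        vectors.append({t: w / norm for t, w in raw.items()})
    return vectors


# Toy, made-up token lists standing in for preprocessed essays.
docs = [
    "ma tanultam valamit".split(),
    "ma átgondoltam miért tanultam".split(),
]
vectors = tfidf_vectors(docs)
```

Terms shared by both toy documents receive lower weights than terms unique to one of them, which is the property that lets a linear classifier focus on distinguishing vocabulary.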

Key insights

Automated reflection classification in Hungarian essays is feasible using both classical ML and fine-tuned transformers.

Principles

Method

The study used expert-annotated essays, comparing classical ML with TF-IDF/embeddings against fine-tuned Hungarian transformers, addressing class imbalance via weighting, oversampling, augmentation, and alternative loss functions.
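
Two of the balancing strategies named above, class weighting and oversampling, can be sketched in a few lines. This is an illustration under stated assumptions, not the study's implementation: the weights use the common inverse-frequency heuristic `n_samples / (n_classes * count)`, the oversampler duplicates minority-class essays at random until every class matches the largest one, and the label distribution is invented for the example.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: rarer classes get proportionally larger weights."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * count) for c, count in counts.items()}


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class items until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Made-up imbalanced labels on a four-level scale (levels numbered 1 to 4 here).
labels = [1] * 6 + [2] * 3 + [3] * 2 + [4] * 1
texts = [f"essay_{i}" for i in range(len(labels))]

weights = balanced_class_weights(labels)
balanced_texts, balanced_labels = random_oversample(texts, labels)
```

Oversampling should be applied to the training split only; duplicating essays before the train/test split would leak copies of test items into training.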

Best for: AI Scientist, Research Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.