Towards interpretable models for language proficiency assessment: Predicting the CEFR level of Estonian learner texts

Source: Computation and Language · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Intermediate, quick

Summary

The study classifies Estonian proficiency examination writings (CEFR levels A2 to C1) using carefully selected linguistic features, with the aim of building more explainable and generalizable machine learning models for language testing. The researchers analyzed lexical, morphological, surface, and error features in the training data to identify predictors of increasing complexity and correctness that do not depend on the writing task. Classification models trained on these pre-selected features reached a test accuracy of approximately 0.9, comparable to models trained on a broader feature set, but with less variation in how different text types were classified. An evaluation on an older exam sample indicated that writing complexity has increased over 7 to 10 years; with specific feature sets, accuracy on that sample still reached 0.8. The findings have been integrated into the writing evaluation module of an open-source Estonian language learning environment.

Key takeaway

For AI scientists developing automated language assessment tools, the study suggests that a carefully selected set of linguistic features can make models more interpretable and more robust. In this work, such models reached an accuracy of about 0.9 on Estonian learner texts, varied less across text types than models using a broader feature set, and exposed a shift in writing complexity between older and newer exam samples. Feature-driven approaches of this kind can also be integrated into open-source language learning platforms, as was done here for Estonian.

Key insights

Careful feature selection yields interpretable, generalizable models for language proficiency assessment with high accuracy.

Method

The method has three steps: analyze lexical, morphological, surface, and error features to identify predictors of proficiency; train classification models on the selected features; and compare those models with ones trained on a broader feature set, including on an older exam sample.
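The feature-then-classify workflow described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function `extract_features`, the four features, the `error_count` input (imagined as coming from an external spelling or grammar checker), the `NearestCentroid` class, and the invented English toy texts. The summary does not state which features or classifiers the study used, and morphological features are left out because they would need an Estonian morphological analyzer.

```python
import math
import re
from collections import defaultdict


def extract_features(text, error_count=0):
    """Return a few surface, lexical and error features (illustrative only).

    error_count is a hypothetical input, e.g. from an external checker.
    """
    words = re.findall(r"\w+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words:
        raise ValueError("text contains no words")
    n_words = len(words)
    return {
        # Surface features
        "mean_word_length": sum(len(w) for w in words) / n_words,
        "mean_sentence_length": n_words / max(len(sentences), 1),
        # Lexical diversity: root type-token ratio (types / sqrt(tokens))
        "root_ttr": len(set(words)) / math.sqrt(n_words),
        # Error feature, normalized by text length
        "errors_per_word": error_count / n_words,
    }


class NearestCentroid:
    """Tiny nearest-centroid classifier on z-scored features."""

    def fit(self, rows, labels):
        self.names = sorted(rows[0])
        cols = list(zip(*[[r[n] for n in self.names] for r in rows]))
        self.means = [sum(c) / len(c) for c in cols]
        # Fall back to 1.0 when a feature has zero variance
        self.stds = [
            (sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, self.means)
        ]
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(self._scale(row))
        # One centroid per level; inspecting these shows which features
        # separate the levels, which is what makes the model readable.
        self.centroids = {
            label: [sum(c) / len(c) for c in zip(*vectors)]
            for label, vectors in grouped.items()
        }
        return self

    def _scale(self, row):
        return [
            (row[n] - m) / s
            for n, m, s in zip(self.names, self.means, self.stds)
        ]

    def predict(self, row):
        x = self._scale(row)
        return min(self.centroids, key=lambda y: math.dist(x, self.centroids[y]))


# Invented toy texts (not exam data): (text, hypothetical error count, level)
TOY_TRAINING = [
    ("I have a cat. It is big. I like it.", 3, "A2"),
    ("We go to the shop. The shop is near.", 2, "A2"),
    ("Although the municipality postponed construction, residents "
     "nevertheless organised an extraordinarily productive consultation.", 0, "C1"),
    ("Considering environmental consequences, policymakers increasingly "
     "prioritise sustainable transportation infrastructure throughout "
     "metropolitan regions.", 0, "C1"),
]

model = NearestCentroid().fit(
    [extract_features(text, errors) for text, errors, _ in TOY_TRAINING],
    [level for _, _, level in TOY_TRAINING],
)

new_text = "My dog is old. He is nice."
print(model.predict(extract_features(new_text, error_count=2)))
```

A nearest-centroid model is used here only because it fits in a few lines and its per-level centroids are easy to read. A real system would need far more data, all four levels, and a held-out test set, ideally with accuracy reported per text type to check the kind of cross-task variation the study discusses.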

Best for: AI Scientist, AI Researcher, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.