Automatic Evaluation of ENEM Essays: An Empirical Study on Linguistic and Contextual Representations

2026-04-12 · Source: Paper Index on ACL Anthology · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Automatic Essay Scoring (AES) for Brazilian Portuguese, specifically for the Enem exam, remains difficult because essays are assessed on multiple competencies and scored on an ordinal scale. This study investigates hybrid modeling strategies for competency-level AES that integrate explicit linguistic features with contextual representations. The authors used the Enem-AES corpus and modeled the evaluation of each competency as an ordinal prediction problem using the CORAL framework. The empirical comparison covered traditional lexical representations, linguistic metrics from NILC-Metrix, task-oriented manual features, contextual embeddings, and combinations of these. Hybrid models achieved the highest average agreement with human scores, though performance varied by competency and representation type. The analysis also examined model behavior on essays where raters disagreed, showing that annotation variability affects measured performance.

Key takeaway

Research scientists developing AES systems for high-stakes exams like Enem should prioritize hybrid modeling strategies that integrate explicit linguistic features with contextual embeddings. This approach showed the highest average agreement with human scores, but expect to tune models separately for each competency and to account for human rater disagreement when interpreting your system's performance.

Key insights

Hybrid models combining linguistic features and contextual embeddings improve Automatic Essay Scoring for Brazilian Portuguese Enem exams.

Principles

AES performance varies across competencies.
Annotation variability impacts model performance.
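
One way to examine both principles is to compute agreement per competency and then separately on essays where the human raters agreed and disagreed. The sketch below is illustrative only: this summary does not name the agreement metric, so quadratic weighted kappa (a common choice in AES) is an assumption, and the function names are hypothetical.

```python
import numpy as np

def quadratic_weighted_kappa(rater_a, rater_b, num_levels):
    """Agreement between two integer rank vectors with values in 0..num_levels-1."""
    a = np.asarray(rater_a, dtype=int)
    b = np.asarray(rater_b, dtype=int)
    # Confusion matrix of observed (a, b) pairs.
    observed = np.zeros((num_levels, num_levels))
    np.add.at(observed, (a, b), 1)
    # Counts expected if the two raters were independent.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / len(a)
    # Penalty grows with the squared distance between ranks.
    idx = np.arange(num_levels)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (num_levels - 1) ** 2
    denom = (weights * expected).sum()
    if denom == 0:
        return float("nan")  # undefined when both raters use a single identical rank
    return float(1.0 - (weights * observed).sum() / denom)

def split_by_rater_agreement(rater_1, rater_2):
    """Boolean masks for essays where two human raters agree and where they disagree."""
    agree = np.asarray(rater_1) == np.asarray(rater_2)
    return agree, ~agree
```

Computing the metric once per competency, and again on each mask, makes it possible to see whether a drop in agreement comes from the model or from noise in the reference scores.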

Method

The study modeled AES competency evaluation as an ordinal prediction problem using the CORAL framework, comparing lexical, linguistic, manual, and contextual representations on the Enem-AES corpus.
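
The ordinal formulation can be sketched as follows. This is a minimal NumPy illustration of the general CORAL idea (a shared weight vector with one bias per rank threshold, trained as a set of binary tasks), not the authors' implementation. The six-level scale (scores 0 to 200 in steps of 40) and all function names are assumptions made for the example.

```python
import numpy as np

NUM_LEVELS = 6  # assumption: competency scores 0, 40, ..., 200 mapped to ranks 0..5

def score_to_rank(score, step=40):
    """Map a raw competency score to an ordinal rank (assumed step of 40)."""
    return score // step

def coral_levels(rank, num_levels=NUM_LEVELS):
    """Extended binary encoding: entry k answers 'is the rank greater than k?'."""
    return (np.arange(num_levels - 1) < rank).astype(np.float32)

def coral_logits(features, weights, biases):
    """Shared weight vector for all thresholds, one bias per threshold."""
    return (features @ weights)[:, None] + biases[None, :]

def coral_loss(logits, levels):
    """Binary cross-entropy summed over thresholds, averaged over essays."""
    log_p = -np.logaddexp(0.0, -logits)     # log sigmoid(z), numerically stable
    log_not_p = -np.logaddexp(0.0, logits)  # log (1 - sigmoid(z))
    per_essay = -(levels * log_p + (1.0 - levels) * log_not_p).sum(axis=1)
    return float(per_essay.mean())

def coral_predict(logits):
    """Predicted rank = number of thresholds with probability above 0.5."""
    return (logits > 0).sum(axis=1)
```

Because every threshold shares the same weights, the only thing that differs between the binary tasks is the bias, which is what keeps the predicted probabilities consistent across ranks. In a real system the `features` matrix would come from whichever representation is being compared.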

In practice

Combine linguistic features with contextual embeddings.
Use CORAL for ordinal prediction problems.
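
A hybrid representation can be as simple as concatenating scaled linguistic metrics with a contextual embedding for each essay. The sketch below assumes both have already been computed as NumPy arrays. The function names are hypothetical, and the summary does not specify how the study fused the two views, so concatenation is an assumption.

```python
import numpy as np

def fit_scaler(train_features):
    """Per-column mean and standard deviation, estimated on training essays only."""
    mean = train_features.mean(axis=0)
    std = train_features.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # leave constant columns unscaled
    return mean, std

def build_hybrid(linguistic, embeddings, mean, std):
    """Standardize linguistic metrics, then append the contextual embedding."""
    scaled = (linguistic - mean) / std
    return np.concatenate([scaled, embeddings], axis=1)
```

Scaling matters here because hand-crafted metrics (counts, ratios, lengths) can have much larger ranges than embedding dimensions and would otherwise dominate a linear ordinal head.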

Topics

Automatic Essay Scoring
Enem Exam
Contextual Embeddings
Linguistic Features
Ordinal Prediction

Best for: Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.