Enhancing Science Classroom Discourse Analysis through Joint Multi-Task Learning for Reasoning-Component Classification
Summary
An automated discourse analysis system (ADAS) classifies teacher and student utterances in science classrooms along two dimensions: Utterance Type (UT) and Reasoning Component (RC). The system uses a dual-probe head RoBERTa-base classifier and addresses severe label imbalance through stratified corpus re-splitting and LLM-based synthetic data augmentation. The RC coding scheme was revised from six classes to four (ER, SR-D, SR-I, O) to improve learnability and reduce ambiguity. LLM augmentation significantly improved UT minority-class recognition, with UT macro-F1 reaching 0.635. On RC classification, however, a TF-IDF + Logistic Regression baseline outperformed the transformer models (macro-F1 of 0.574), which suggests that the RC classes are lexically separable. Discourse pattern analyses showed that teacher "Feedback-with-Question" (Fq) moves consistently precede student inferential reasoning (SR-I), and that a "Prompt" (P) move followed by another "Prompt" (P) yields the lowest reasoning engagement. The study also identified a positional bias in the human annotations of end-of-lesson cognitive complexity.
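Both headline results are reported as macro-F1, which averages per-class F1 with equal weight and so penalizes a model that ignores rare classes. The sketch below is a minimal pure-Python illustration of that metric, not the paper's evaluation code; the function name `macro_f1` and the toy labels are assumptions made for the example.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted instances scores 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (not from the study): a model that always predicts the majority class "O".
rc_labels = ["ER", "SR-D", "SR-I", "O"]
toy_true = ["O", "O", "O", "O", "ER", "SR-I"]
toy_pred = ["O"] * 6

accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
print("accuracy:", round(accuracy, 3))
print("macro-F1:", round(macro_f1(toy_true, toy_pred, rc_labels), 3))
```

On this toy input the accuracy looks acceptable while macro-F1 is low, which is why macro-F1 is the more informative number under severe label imbalance.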
Key takeaway
NLP engineers building educational analytics tools should note that LLM-based augmentation significantly boosts performance on complex, context-dependent classifications such as Utterance Type, whereas simpler lexical models may suffice or even excel on tasks with strong lexical signals, such as Reasoning Component classification. Prioritize expanding labeled corpora so that transformer models can reach their potential on RC tasks, and account for possible human annotation bias when interpreting temporal discourse patterns.
Key insights
LLM-based augmentation improves utterance-type classification, and automated discourse analysis surfaces teaching patterns associated with student reasoning.
Principles
- Lexical features can suffice for certain classification tasks.
- Teacher "Feedback-with-Question" (Fq) moves consistently precede student inferential reasoning (SR-I).
- Positional bias can influence human annotation outcomes.
Method
A dual-probe head RoBERTa-base classifier is trained with LLM-generated synthetic data and focal loss on a revised 4-class reasoning component scheme, using session-level splitting to prevent data leakage.
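Two ingredients of this method can be sketched in a few lines of pure Python: focal loss, which down-weights examples the model already classifies confidently, and session-level splitting, which keeps every utterance from one lesson on the same side of the split. This is an illustrative sketch under stated assumptions, not the authors' training code: the names `focal_loss` and `session_level_split`, the `gamma=2.0` default, the 20% test fraction, and the `"session"` dictionary key are all hypothetical choices for the example.

```python
import math
import random
from collections import defaultdict


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example: -(1 - p_t)^gamma * log(p_t).

    probs is a list of class probabilities; gamma=2.0 is an assumed default.
    With gamma=0 this reduces to ordinary cross-entropy.
    """
    p_t = probs[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def session_level_split(utterances, test_fraction=0.2, seed=0):
    """Assign whole sessions to train or test so no lesson spans both sets."""
    by_session = defaultdict(list)
    for utt in utterances:
        by_session[utt["session"]].append(utt)

    sessions = sorted(by_session)
    random.Random(seed).shuffle(sessions)
    n_test = max(1, round(len(sessions) * test_fraction))
    test_ids = set(sessions[:n_test])

    train = [u for s in sessions if s not in test_ids for u in by_session[s]]
    test = [u for s in sessions if s in test_ids for u in by_session[s]]
    return train, test


# Toy corpus (not from the study): 5 sessions with 3 utterances each.
toy = [{"session": f"s{i}", "text": f"utt {i}-{j}"} for i in range(5) for j in range(3)]
train_set, test_set = session_level_split(toy)
print(len(train_set), len(test_set))
print(focal_loss([0.9, 0.1], 0) < focal_loss([0.6, 0.4], 0))
```

Splitting by session rather than by utterance matters because neighboring utterances in one lesson share vocabulary and context, so a random utterance-level split would let near-duplicates of test items leak into training.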
In practice
- Use LLM-based augmentation to offset minority-class imbalance.
- Prioritize "Feedback-with-Question" (Fq) in teaching.
- Avoid consecutive "Prompt" (P) moves in instruction.
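The pattern findings behind these recommendations rest on counting which student reasoning label follows which teacher move. A minimal sketch of such a count is shown here, assuming a lesson is stored as an ordered list of `(speaker, label)` pairs with teacher turns carrying UT codes and student turns carrying RC codes. That representation, the function names, and the toy sequence are assumptions for illustration and do not reproduce the study's analysis or data; the sketch covers only adjacent teacher-to-student pairs, not longer framings such as P followed by P.

```python
from collections import Counter, defaultdict


def rc_after_teacher_move(sequence):
    """Count student RC labels that immediately follow each teacher move."""
    counts = defaultdict(Counter)
    for (spk_a, lab_a), (spk_b, lab_b) in zip(sequence, sequence[1:]):
        if spk_a == "T" and spk_b == "S":
            counts[lab_a][lab_b] += 1
    return counts


def share(counts, move, rc):
    """Fraction of student turns after `move` that carry the label `rc`."""
    total = sum(counts[move].values())
    return counts[move][rc] / total if total else 0.0


# Toy lesson (hypothetical): "T" = teacher, "S" = student.
lesson = [
    ("T", "Fq"), ("S", "SR-I"),
    ("T", "P"), ("T", "P"), ("S", "O"),
    ("T", "Fq"), ("S", "SR-I"),
    ("T", "P"), ("S", "SR-D"),
]
transitions = rc_after_teacher_move(lesson)
for move, followers in transitions.items():
    print(move, dict(followers))
```

Comparing `share(transitions, "Fq", "SR-I")` with `share(transitions, "P", "SR-I")` across many lessons is one simple way to check whether a pattern of this kind holds in a given corpus.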
Topics
- Automated Discourse Analysis
- Multi-Task Learning
- Reasoning Component Classification
- LLM Data Augmentation
- RoBERTa-base Model
Best for: NLP Engineer, AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.