An Explainable AI Assistant for Introductory Programming Education: Improving Feedback Reliability with Instructor-AI Collaboration
Summary
Insight is an AI-driven classroom assistant that addresses the challenge of providing scalable, reliable, and personalized feedback in introductory programming courses. It uses an explainable AI model, SANN, to analyze student code and map logical errors to instructor-defined misconceptions, then delivers pre-authored feedback. Large Language Models (LLMs) are used only to generate synthetic training data and to verify feedback, not to interact with students directly. On the FalconCode dataset, a fine-tuned SANN model achieved 88% accuracy in correctness prediction. For feedback matching, GPT-4o verification reduced incorrect feedback to 1.2%, giving 97.62% precision and recall. In a classroom deployment with 69 students, perceptions were positive, and 53.6% were willing to continue using the tool. Compared to GPT-4o, Insight showed higher selectivity (95% vs. 53%) and pedagogical suitability (100% vs. 47%), supporting trustworthy, instructionally aligned guidance.
Key takeaway
AI engineers developing automated feedback systems for programming education should prioritize a triangulated approach over direct LLM-generated feedback. The three parts are an explainable AI model such as SANN, instructor-authored pedagogical knowledge, and LLMs confined to constrained roles: synthetic data generation and verification. This strategy keeps feedback reliable, pedagogically aligned, and explainable, and reduces the risk of passing incorrect or misleading guidance to students.
Key insights
Instructor-AI collaboration with explainable models and constrained LLMs provides scalable, reliable, and pedagogically sound programming feedback.
Principles
- Ground AI feedback in instructor-defined pedagogical knowledge.
- Use explainable AI to identify specific code regions for targeted feedback.
- Generate synthetic data to adapt AI models to new problems.
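To make the second principle concrete, here is a minimal sketch of mapping per-subtree scores back to a code region. It uses Python's standard `ast` module to enumerate statement subtrees; the `weights` list is a hypothetical stand-in for attention scores, not output from an actual SANN model, and the function names are illustrative only.

```python
import ast

def subtree_spans(source: str):
    """List (node_type, first_line, last_line) for each statement subtree,
    sorted by position so the order is stable."""
    tree = ast.parse(source)
    spans = [
        (type(node).__name__, node.lineno, node.end_lineno)
        for node in ast.walk(tree)
        if isinstance(node, ast.stmt)
    ]
    return sorted(spans, key=lambda s: (s[1], s[2]))

def top_region(spans, weights):
    """Return the span with the largest score (one score per span)."""
    best = max(range(len(spans)), key=lambda i: weights[i])
    return spans[best]

def highlight(source: str, span):
    """Return the source lines covered by a span."""
    _, first, last = span
    return source.splitlines()[first - 1:last]

# A student attempt at summing 1..9 that overwrites the accumulator.
student_code = (
    "total = 0\n"
    "for i in range(1, 10):\n"
    "    total = i\n"
    "print(total)\n"
)

spans = subtree_spans(student_code)
# Hypothetical scores, one per span; a trained model would supply these.
weights = [0.05, 0.25, 0.65, 0.05]

region = top_region(spans, weights)
print(region)                            # ('Assign', 3, 3)
print(highlight(student_code, region))   # ['    total = i']
```

Working at the statement level keeps the highlighted region small enough to show a student; a real system might choose a different subtree granularity.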
Method
The framework pretrains and fine-tunes a SANN model on synthetic data, localizes errors through attention over abstract syntax tree (AST) subtrees, and matches student errors to instructor-authored feedback by cosine similarity, with an LLM verification step.
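The cosine-similarity matching step can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the three-dimensional vectors, the feedback bank entries, and the 0.8 threshold are all invented for the example, and a real system would use learned embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def match_feedback(error_vec, bank, threshold=0.8):
    """Return (feedback_id, text, score) for the closest bank entry,
    or None when nothing is similar enough (the system stays silent)."""
    best_id, best_score = None, -1.0
    for feedback_id, (vec, _text) in bank.items():
        score = cosine(error_vec, vec)
        if score > best_score:
            best_id, best_score = feedback_id, score
    if best_id is None or best_score < threshold:
        return None
    return best_id, bank[best_id][1], best_score

# Hypothetical instructor-authored feedback bank: id -> (embedding, text).
bank = {
    "off_by_one": ([1.0, 0.0, 0.0], "Check your loop bounds."),
    "overwrite_accumulator": (
        [0.0, 1.0, 0.0],
        "You replace the total each iteration instead of adding to it.",
    ),
}

match = match_feedback([0.1, 0.9, 0.0], bank)
no_match = match_feedback([0.0, 0.0, 1.0], bank)
```

Returning `None` below the threshold is one way to favor selectivity: the system offers no feedback rather than a poorly matched message.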
In practice
- Highlight code regions using SANN's attention mechanism for explainable feedback.
- Generate synthetic code with LLMs for model fine-tuning on new problems.
- Integrate a GPT-4o verification layer to reduce incorrect automated feedback.
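The verification layer in the last item can be sketched as a gate between the matcher and the student. The sketch below assumes the verifier is any callable that returns an accept or reject decision; `toy_verifier` is a keyword-based placeholder so the example runs offline, and it does not call GPT-4o or any real LLM API.

```python
from typing import Callable, Optional

# A verifier takes (student_code, candidate_feedback) and returns True
# when the feedback is judged applicable to the code.
Verifier = Callable[[str, str], bool]

def deliver_feedback(
    student_code: str,
    candidate: Optional[str],
    verifier: Verifier,
) -> Optional[str]:
    """Pass the candidate feedback through only if the verifier accepts it."""
    if candidate is None:
        return None
    return candidate if verifier(student_code, candidate) else None

def toy_verifier(student_code: str, feedback: str) -> bool:
    """Placeholder check: reject loop-related feedback for code without a loop.
    A deployed system would replace this with an LLM call."""
    mentions_loop = "loop" in feedback.lower()
    has_loop = any(tok in student_code for tok in ("for ", "while "))
    return has_loop or not mentions_loop

accepted = deliver_feedback(
    "for i in range(3): pass", "Check your loop bounds.", toy_verifier
)
rejected = deliver_feedback("x = 1", "Check your loop bounds.", toy_verifier)
```

Keeping the verifier behind a plain callable interface means the pre-authored feedback text is never rewritten by the LLM; the model only decides whether it is shown.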
Topics
- Explainable AI
- Programming Education
- Automated Feedback
- Large Language Models
- SANN Model
- Synthetic Data
Best for: AI Scientist, Research Scientist, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.