An Explainable AI Assistant for Introductory Programming Education: Improving Feedback Reliability with Instructor-AI Collaboration

Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Advanced, extended

Summary

Insight is an AI-driven classroom assistant designed to provide scalable, reliable, and personalized feedback in introductory programming courses. It uses an explainable AI model, SANN, to analyze student code and map logical errors to instructor-defined misconceptions, then delivers feedback the instructor has authored in advance. Large Language Models (LLMs) are used only to generate synthetic training data and to verify feedback, not to interact with students directly. Evaluated on the FalconCode dataset, a fine-tuned SANN model achieved 88% accuracy in correctness prediction. For feedback matching, it reached 97.62% precision and recall after GPT-4o verification reduced incorrect feedback to 1.2%. In a classroom deployment with 69 students, perceptions were positive, and 53.6% of students were willing to continue using the tool. Compared with GPT-4o, Insight showed higher selectivity (95% vs. 53%) and pedagogical suitability (100% vs. 47%), which supports trustworthy, instructionally aligned guidance.

Key takeaway

AI engineers building automated feedback systems for programming education should prioritize a triangulated approach over direct LLM-generated feedback: integrate explainable AI models such as SANN, draw on instructor-authored pedagogical knowledge, and confine LLMs to constrained roles in synthetic data generation and verification. This strategy keeps feedback reliable, pedagogically aligned, and explainable, and it lowers the risk of passing incorrect or misleading guidance to students.

Key insights

Instructor-AI collaboration with explainable models and constrained LLMs provides scalable, reliable, and pedagogically sound programming feedback.

Method

The framework pretrains and fine-tunes a SANN model on synthetic data, localizes errors through attention over AST subtrees, and matches each student error to instructor-authored feedback by cosine similarity, with an LLM verifying the matched feedback.
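
The cosine-similarity matching step can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes each localized error and each instructor-authored feedback item is already represented as a numeric vector. The names `match_feedback` and `feedback_bank` and the `threshold` value of 0.8 are hypothetical choices made for illustration. The LLM verification stage is not modeled here.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_feedback(
    error_vec: Sequence[float],
    feedback_bank: Dict[str, Tuple[Sequence[float], str]],
    threshold: float = 0.8,  # hypothetical cutoff, not a value from the paper
) -> Optional[Tuple[str, float]]:
    """Return (misconception key, score) for the closest feedback item,
    or None when no item is similar enough, so the system stays silent
    rather than showing a poorly matched message."""
    best_key: Optional[str] = None
    best_score = -1.0
    for key, (vec, _text) in feedback_bank.items():
        score = cosine_similarity(error_vec, vec)
        if score > best_score:
            best_key, best_score = key, score
    if best_key is None or best_score < threshold:
        return None
    return best_key, best_score


# Toy two-dimensional vectors standing in for learned embeddings (illustrative only).
feedback_bank = {
    "off_by_one": ([1.0, 0.0], "Check the upper bound of your loop."),
    "wrong_operator": ([0.0, 1.0], "Review which comparison operator you need."),
}

confident = match_feedback([0.9, 0.1], feedback_bank)
ambiguous = match_feedback([1.0, 1.0], feedback_bank)
```

In this sketch `confident` resolves to the `off_by_one` item, while `ambiguous` is `None` because the vector sits equally between both items and falls below the cutoff. Abstaining on weak matches is one simple way to trade coverage for selectivity. Any match that does pass would still go through a verification step before reaching a student.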

Best for: AI Scientist, Research Scientist, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.