Learning to Prompt: Improving Student Engagement with Adaptive LLM-based High-School Tutoring

2026-06-19 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

A subject-aware, adaptive LLM-based tutoring system was developed and tested to improve student engagement across academic disciplines. The system uses a prompt routing model, first trained in a simulation environment on 14 pedagogical features derived from conversation transcripts. The router dynamically selects among 20 instructional strategies and outperformed two static baselines in simulation, scoring 0.694 versus 0.647 and 0.64 (p<0.001). In a real-world A/B test with 359 Dutch high-school students across 656 conversations, the adaptive system showed successful sim-to-real transfer: it improved instructional efficiency by reducing interaction turns by approximately 3 (p=0.007) while maintaining pedagogical quality. A greedy router achieved a 19.1% exercise conversion rate, comparable to the 19.6% baseline, whereas a stochastic router raised conversion to 28.1%, 8.5 percentage points above the baseline.

Key takeaway

For machine learning engineers building adaptive educational AI, the results support subject-aware prompt routing: a router that adjusts pedagogical strategy from conversation features reduced interaction turns by about 3 while maintaining pedagogical quality. Consider deploying a stochastic exploration mode as well as a greedy one. In the A/B test, greedy exploitation only matched the baseline on exercise conversion (19.1% versus 19.6%), while stochastic sampling reached 28.1%.

Key insights

Adaptive prompt routing with subject-aware LLM evaluation improves tutoring efficiency and student engagement in diverse subjects.

Principles

Subject-aware prompting prevents representation degeneration in LLM embeddings.
Feature-based AI feedback, calibrated with human labels, provides robust pedagogical signals.
Sim-to-real transfer requires reward calibration to align simulated and real-world distributions.

Method

Train a prompt routing model using PPO in a simulated environment with an LLM evaluator, then fine-tune with real-world data, employing exploitation/exploration modes for deployment.
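
To make the two deployment modes concrete, here is a minimal sketch of how a router might pick one of the 20 strategies from the 14 conversation features. It is not the authors' implementation: the linear scoring head, the weight matrix, and the names select_strategy, strategy_logits, and softmax are all assumptions for illustration, and PPO training itself is omitted. Only the selection step is shown.

```python
import math
import random

N_FEATURES = 14    # pedagogical features per conversation (from the summary)
N_STRATEGIES = 20  # instructional strategies the router chooses among


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def strategy_logits(weights, features):
    """Hypothetical linear policy head: one weight row per strategy."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def select_strategy(weights, features, mode="greedy", rng=None):
    """Return a strategy index in exploitation or exploration mode."""
    probs = softmax(strategy_logits(weights, features))
    if mode == "greedy":
        # Exploitation: always take the most probable strategy.
        return max(range(len(probs)), key=probs.__getitem__)
    if mode == "stochastic":
        # Exploration: sample a strategy in proportion to its probability.
        rng = rng or random.Random()
        return rng.choices(range(len(probs)), weights=probs, k=1)[0]
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    rng = random.Random(0)
    # Random weights stand in for a trained policy (assumption).
    weights = [[rng.gauss(0, 1) for _ in range(N_FEATURES)]
               for _ in range(N_STRATEGIES)]
    features = [rng.random() for _ in range(N_FEATURES)]
    print("greedy:", select_strategy(weights, features, "greedy"))
    print("stochastic:", select_strategy(weights, features, "stochastic", rng))
```

The greedy mode is deterministic for a given feature vector, while the stochastic mode can surface strategies the policy ranks lower, which is one plausible reading of why exploration found higher-converting strategies in deployment.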

In practice

Utilize hybrid input representations for effective subject differentiation.
Implement stochastic prompt sampling during deployment to discover engaging strategies.
Calibrate AI feedback scores to bridge the sim-to-real gap in educational settings.
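
The calibration step in the list above can be sketched as a simple linear fit that maps AI evaluator scores onto human labels. The summary does not say which calibration method the authors used, so ordinary least squares is an assumption here, as are the function names fit_linear_calibration and calibrate and the [0, 1] score range.

```python
def fit_linear_calibration(ai_scores, human_labels):
    """Least-squares fit of human_label ~ slope * ai_score + intercept."""
    n = len(ai_scores)
    if n != len(human_labels) or n < 2:
        raise ValueError("need at least two paired scores")
    mean_x = sum(ai_scores) / n
    mean_y = sum(human_labels) / n
    var_x = sum((x - mean_x) ** 2 for x in ai_scores)
    if var_x == 0:
        raise ValueError("AI scores must not all be identical")
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ai_scores, human_labels))
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def calibrate(score, slope, intercept, lo=0.0, hi=1.0):
    """Apply the fit and clip to the assumed score range [lo, hi]."""
    return min(hi, max(lo, slope * score + intercept))


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    ai = [0.2, 0.4, 0.6, 0.8]
    human = [0.1, 0.3, 0.5, 0.7]
    slope, intercept = fit_linear_calibration(ai, human)
    print(calibrate(0.5, slope, intercept))
```

A linear map is the simplest choice. If the evaluator's scores are related to human labels monotonically but not linearly, a method such as isotonic regression may fit better.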

Topics

LLM Tutoring Systems
Adaptive Prompting
Contextual Bandits
Sim-to-Real Transfer
Pedagogical AI
Student Engagement

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.