The Agentic Gap: What Enterprises Think vs. What Actually Works With Jeff Dalton

2026-04-10 · Source: AI Explained · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, extended

Summary

Jeff Dalton, Head of AI and Chief Scientist at Valence, discussed agentic system design, evaluation, and memory management in AI coaching systems. He highlighted the enduring relevance of classical AI theory, emphasizing that agent fundamentals like planning, action, observation, and state maintenance remain unchanged, though their implementation has evolved with large language models. Dalton detailed Valence's AI coach, Nadia, which aims to improve user performance over time by optimizing for long-term value, introducing "productive friction," and personalizing interactions based on organizational values and user profiles. He stressed the importance of an "eval-first" approach, using structured prompts as code, human-calibrated rubrics for subjective evaluations, and a multi-layered "defense-in-depth" strategy for guardrails to ensure safety and address domain-specific challenges in enterprise deployments.

Key takeaway

AI engineers building complex agentic systems should adopt an "eval-first" mindset and structured prompt design to keep those systems traceable and debuggable. Define clear objective functions and use human-calibrated rubrics for subjective evaluations, especially in coaching or personalized AI, so that long-term impact and user growth are measured accurately. Use a defense-in-depth approach to guardrails to manage diverse enterprise deployment risks and maintain user trust.

Key insights

Classical AI agent principles still apply. What large language models change is how those principles are implemented, which makes more complex agentic systems practical.

Principles

Agent fundamentals (plan, act, observe, state) are timeless.
Evaluation must precede prompt engineering.
Memory is a first-class object in coaching systems.
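
The plan, act, observe, state cycle named in the first principle can be sketched as a short loop. This is a minimal illustration, not Valence's implementation: `AgentState`, `run_agent`, `toy_plan`, and `toy_act` are hypothetical names, and the planner and actor are stand-ins for what would be LLM and tool calls in a real system.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    """State the agent carries between steps (hypothetical structure)."""
    goal: str
    observations: list[str] = field(default_factory=list)
    done: bool = False


def run_agent(
    state: AgentState,
    plan: Callable[[AgentState], str],
    act: Callable[[str], str],
    max_steps: int = 5,
) -> AgentState:
    """Repeat plan -> act -> observe, updating state, until done or out of steps."""
    for _ in range(max_steps):
        if state.done:
            break
        action = plan(state)                     # plan: choose the next action from state
        observation = act(action)                # act: execute the action
        state.observations.append(observation)   # observe: record the result in state
        state.done = observation == "done"
    return state


# Toy stand-ins for the planner and the environment.
def toy_plan(state: AgentState) -> str:
    return "finish" if len(state.observations) >= 2 else "work"


def toy_act(action: str) -> str:
    return "done" if action == "finish" else f"did {action}"


final = run_agent(AgentState(goal="demo"), toy_plan, toy_act)
print(final.observations)
```

The loop itself does not depend on how `plan` is implemented, which is one way to read the claim that the fundamentals are unchanged while the implementation has moved to LLMs.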

Method

Design agentic systems with structured prompts as code, enabling traceability and inspection. Employ human-calibrated rubrics for subjective evaluation, focusing on process quality, outcome quality, and user progress over time.
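
As a rough sketch of this method, a prompt can be held as a versioned object rather than a loose string, and a rubric can be checked against human ratings before it is trusted. Everything here is assumed for illustration: the `PromptSpec` class, the rubric weights, the 1 to 5 scale, and the one-point tolerance are not taken from the talk. Only the three rubric dimensions come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptSpec:
    """A prompt treated as code: named, versioned, and inspectable."""
    name: str
    version: str
    template: str

    def render(self, **fields: str) -> str:
        return self.template.format(**fields)


# Illustrative weights over the three dimensions named in the text.
RUBRIC = {"process_quality": 0.4, "outcome_quality": 0.4, "user_progress": 0.2}


def rubric_score(ratings: dict[str, int]) -> float:
    """Weighted mean of per-dimension ratings (assumed 1-5 scale)."""
    return sum(weight * ratings[name] for name, weight in RUBRIC.items())


def agreement(human: list[int], judge: list[int], tolerance: int = 1) -> float:
    """Fraction of items where the automated judge is within `tolerance` of the human rating."""
    pairs = list(zip(human, judge))
    return sum(abs(h - j) <= tolerance for h, j in pairs) / len(pairs)


coach_prompt = PromptSpec(
    name="coach_reply",
    version="0.1",
    template="You are a coach. The user's goal is: {goal}. Ask one clarifying question.",
)
print(coach_prompt.render(goal="run better 1:1s"))
print(rubric_score({"process_quality": 5, "outcome_quality": 3, "user_progress": 4}))
print(agreement(human=[5, 3, 1], judge=[4, 3, 4]))
```

Logging `name` and `version` alongside each response is what makes a given output traceable to the prompt that produced it. A low `agreement` value signals that the rubric or the judge needs recalibrating against human raters before its scores are used.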

In practice

Start with simple prompts and small data sets for initial evaluation.
Implement multi-layered guardrails for robust safety.
Prioritize user privacy and control over personal data.
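
One way to picture multi-layered guardrails is as independent checks on both the input and the output, any of which can stop a reply. The sketch below assumes simple keyword checks and a fixed fallback message. The function names, the blocked terms, and the fallback text are all hypothetical, and a production system would likely use classifiers or policy models at each layer.

```python
from typing import Callable, Optional

# A check returns a reason string if it blocks the text, or None to let it pass.
Check = Callable[[str], Optional[str]]

FALLBACK = "I can't help with that here."


def block_empty(text: str) -> Optional[str]:
    return "empty message" if not text.strip() else None


def block_terms(terms: list[str]) -> Check:
    """Build a check that blocks text containing any listed term (case-insensitive)."""
    def check(text: str) -> Optional[str]:
        lowered = text.lower()
        for term in terms:
            if term in lowered:
                return f"blocked term: {term}"
        return None
    return check


def guarded_reply(
    user_msg: str,
    generate: Callable[[str], str],
    input_checks: list[Check],
    output_checks: list[Check],
) -> tuple[str, Optional[str]]:
    """Run input layers, generate, then run output layers; return (reply, block_reason)."""
    for check in input_checks:
        reason = check(user_msg)
        if reason:
            return FALLBACK, reason
    reply = generate(user_msg)
    for check in output_checks:
        reason = check(reply)
        if reason:
            return FALLBACK, reason
    return reply, None


def fake_model(message: str) -> str:
    # Stand-in for an LLM call.
    return f"Let's think about: {message}"


print(guarded_reply(
    "How do I give feedback?",
    fake_model,
    input_checks=[block_empty],
    output_checks=[block_terms(["diagnosis"])],
))
```

Returning the block reason alongside the reply keeps each layer's decisions inspectable, so a failure can be traced to the specific layer that fired.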

Topics

Agentic Systems
AI Coaching
Evaluation Methodologies
Guardrail Systems
Memory Management

Best for: AI Scientist, AI Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Explained.