Lessons Learned from Building Agentic Systems With Jayeeta Putatunda

2025-08-16 · Source: AI Explained · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Advanced, extended

Summary

Jayeeta Putatunda, Director of the AI Center of Excellence at Fitch Group, discusses critical lessons from building and deploying AI agent systems. She highlights the challenges of moving from proof of concept to production, recommending the 80-20 rule to focus on high-impact, low-effort use cases and calling for specific evaluation metrics rather than general claims of productivity gains. Putatunda distinguishes workflow-like agents from autonomous agents, noting that financial applications often need to balance autonomy with human oversight because the data is high-stakes. She stresses robust testing, data preparation, and continuous observability, and advocates versioning both prompts and evaluation outputs. The discussion also covers diagnosing failures through granular logging and the need to integrate traditional ML models and causal AI to ground non-deterministic LLM outputs, especially in finance, where hallucination is intolerable.

Key takeaway

For AI engineers building agentic systems in finance, prioritize defining clear business problems and specific, measurable evaluation metrics from the outset. Focus on hybrid architectures that ground non-deterministic LLM outputs with established predictive models and causal AI. Implement comprehensive logging, version prompts and evaluation data, and run rigorous beta tests to surface edge cases. Together these practices build stakeholder trust in non-deterministic systems, improve reliability, and reduce hallucination risk in production.

Key insights

Successful AI agent deployment requires strategic use case selection, rigorous evaluation, and robust observability from concept to production.

Principles

Prioritize high-impact, low-effort use cases (80-20 rule).
Define specific, measurable evaluation metrics beyond general productivity.
Integrate human oversight and traditional ML for high-stakes applications.

Method

Implement granular logging at every agent step, version prompts and evaluation outputs, and conduct beta testing to identify edge cases before production deployment.
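A minimal sketch of step-level logging with prompt versioning, offered as one possible reading of this method rather than the approach described in the talk. Every name here (prompt_version, StepRecord, RunTrace, fake_agent_step) is hypothetical, the prompt version is simply a hash of the prompt text, and the agent step is a stand-in that returns a dictionary instead of calling a real model.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field, asdict


def prompt_version(prompt_text: str) -> str:
    """Derive a short, stable version id from the prompt text (assumed scheme)."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()[:12]


@dataclass
class StepRecord:
    """One log entry per agent step."""
    step: str
    prompt_version: str
    tool_calls: list
    tokens_used: int
    latency_s: float


@dataclass
class RunTrace:
    """Collects the step records for a single agent run."""
    records: list = field(default_factory=list)

    def log_step(self, step, prompt_text, fn, *args):
        # Time the step and record what it did, then pass its result through.
        start = time.perf_counter()
        result = fn(*args)
        latency = time.perf_counter() - start
        self.records.append(
            StepRecord(
                step=step,
                prompt_version=prompt_version(prompt_text),
                tool_calls=result.get("tool_calls", []),
                tokens_used=result.get("tokens_used", 0),
                latency_s=latency,
            )
        )
        return result

    def to_jsonl(self) -> str:
        # One JSON object per line, convenient for later diffing across runs.
        return "\n".join(json.dumps(asdict(r)) for r in self.records)


def fake_agent_step(query: str) -> dict:
    """Stand-in for a real LLM or tool call (hypothetical)."""
    return {"answer": query.upper(), "tool_calls": ["lookup_rating"], "tokens_used": 42}


trace = RunTrace()
trace.log_step("retrieve", "You are a ratings assistant.", fake_agent_step, "acme corp")
print(trace.to_jsonl())
```

Hashing the prompt text means any edit to a prompt yields a new version id automatically, so each logged step can be tied back to the exact prompt that produced it. A team might equally use explicit version tags stored in source control.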

In practice

Use hybrid models combining LLMs with predictive ML for financial data.
Log all tool calls, token usage, and response times for traceability.
Collaborate closely with business stakeholders to define metrics and risks.
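One way the hybrid idea above could look in code is a cross-check of an LLM-produced figure against a predictive model's estimate, with disagreements routed to a human. This is an illustrative sketch only: ground_llm_output, GroundingResult, and the 5% tolerance are assumptions made for the example, not details from the talk.

```python
from dataclasses import dataclass


@dataclass
class GroundingResult:
    llm_value: float
    model_value: float
    relative_gap: float
    needs_human_review: bool


def ground_llm_output(llm_value: float, model_value: float,
                      tolerance: float = 0.05) -> GroundingResult:
    """Compare an LLM-produced number with a predictive model's estimate.

    The tolerance is an assumed threshold; in practice it would be agreed
    with business stakeholders for each metric.
    """
    # Guard against division by zero when the model estimate is 0.
    denom = max(abs(model_value), 1e-9)
    gap = abs(llm_value - model_value) / denom
    return GroundingResult(llm_value, model_value, gap, gap > tolerance)


# Hypothetical figures: the LLM output is far from the model estimate.
check = ground_llm_output(llm_value=120.0, model_value=100.0)
if check.needs_human_review:
    print("Flag for analyst review: relative gap", round(check.relative_gap, 3))
```

The deterministic model acts as the reference point here, so a hallucinated figure is caught by the gap check rather than trusted on the LLM's word alone.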

Topics

AI Agent Systems
Generative AI Deployment
Evaluation Metrics
AI Observability
Hybrid AI Architectures

Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Explained.