MoCA-Agent: A Market-of-Claims Code Agent for Financial and Numerical Reasoning

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

MoCA-Agent is a market-of-claims code agent for financial and numerical reasoning, where answers depend on precise grounding in facts, formulas, and units. In these tasks a subtle error, such as a misread cell or a wrong operation, can yield a plausible but incorrect result. MoCA-Agent targets this failure mode by replacing free-form multi-agent debate with claim-level verification. It decomposes each question into typed atomic claims, has specialist "trader agents" evaluate those claims, and synthesizes an executable Python program from the accepted evidence. A code-aware verifier then checks the program for consistency and errors and permits one market-aware repair round. With a fixed Qwen3.6-27B backbone, MoCA-Agent is evaluated on ten public benchmarks, with reported results including 78.3% on FinQA, 76.0% on FinanceMath, 71.2% on MultiHiertt, 86.9% on ESGenius, and an 85.6% average on FinChart-Bench. The results suggest that aggregating evidence at the level of atomic claims improves robustness in high-stakes numerical reasoning.

Key takeaway

If you are building financial reasoning systems, consider claim-level verification to improve accuracy and robustness. As MoCA-Agent's benchmark results indicate, this approach mitigates errors from misread cells or incorrect operations. Decompose questions into atomic claims and use confidence-weighted evidence aggregation to synthesize verifiable code, which can make high-stakes numerical reasoning applications more reliable.

Key insights

MoCA-Agent uses a market-of-claims approach for robust financial and numerical reasoning.

Method

MoCA-Agent decomposes questions into typed atomic claims, uses specialist agents to "buy or sell" claims, clears orders into accept/reject decisions, synthesizes a Python program, and verifies it with a code-aware verifier.
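
The clearing step described above can be illustrated with a minimal sketch. The summary does not specify the clearing rule MoCA-Agent actually uses, so the rule here (accept a claim when confidence-weighted buys outweigh sells) is an assumption, and every name (`Claim`, `Order`, `clear_market`, the trader labels, the example figures) is hypothetical.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Claim:
    """A typed atomic claim extracted from the question and its context."""
    claim_id: str
    kind: Literal["fact", "formula", "unit"]
    text: str


@dataclass(frozen=True)
class Order:
    """One trader agent's position on one claim (hypothetical schema)."""
    claim_id: str
    trader: str
    side: Literal["buy", "sell"]
    confidence: float  # assumed to lie in [0, 1]


def clear_market(claims, orders, threshold=0.0):
    """Assumed clearing rule: accept a claim when the sum of buy
    confidences minus the sum of sell confidences exceeds `threshold`."""
    net = {c.claim_id: 0.0 for c in claims}
    for o in orders:
        if o.claim_id not in net:
            continue  # ignore orders on unknown claims
        net[o.claim_id] += o.confidence if o.side == "buy" else -o.confidence
    return {cid: score > threshold for cid, score in net.items()}


# Illustrative data only: two competing readings of the same table cell.
claims = [
    Claim("c1", "fact", "revenue_2021 = 120.0 (USD millions)"),
    Claim("c2", "fact", "revenue_2020 = 100.0 (USD millions)"),
    Claim("c3", "formula", "growth = (revenue_2021 - revenue_2020) / revenue_2020"),
    Claim("c4", "fact", "revenue_2020 = 110.0 (USD millions)"),  # misread cell
]
orders = [
    Order("c1", "table_reader", "buy", 0.9),
    Order("c1", "text_reader", "buy", 0.8),
    Order("c2", "table_reader", "buy", 0.9),
    Order("c2", "text_reader", "buy", 0.7),
    Order("c3", "formula_checker", "buy", 0.8),
    Order("c4", "table_reader", "sell", 0.9),
    Order("c4", "text_reader", "buy", 0.3),
]

decisions = clear_market(claims, orders)
accepted = [c for c in claims if decisions[c.claim_id]]
print([c.claim_id for c in accepted])
```

In this sketch the misread value `c4` is rejected because the sell order outweighs the buy order, so only `c1`, `c2`, and `c3` would be passed on to program synthesis. A real system would also need a policy for ties and for claims that receive no orders; here both default to rejection.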

Best for: AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.