Agentic Trading: When LLM Agents Meet Financial Markets

2026-03-09 · Source: cs.AI updates on arXiv.org · Field: Finance & Economics — Capital Markets & Investment Management, FinTech & Digital Financial Services · Depth: Expert, extended

Summary

The paper "Agentic Trading: When LLM Agents Meet Financial Markets" provides an audit-oriented evidence map of 77 studies on Large Language Model (LLM)-based trading agents, screened through March 9, 2026. It reframes these agents as expert-system decision pipelines. A primary empirical subset of 19 studies, each satisfying "Action Output plus Closed-Loop Evaluation," shows that evaluation protocols are largely incomparable across studies. Only 2/19 studies report extractable time-consistent split protocols, 1/19 reports an explicit transaction-cost model, 1/19 documents universe/survivorship handling, and 11/19 report execution timing or semantics. On the R0-R3 reproducibility scale, 15/19 studies are rated R0 and none reaches R3. The survey introduces an Architecture–Capability–Adaptation (A-C-A) analytical lens and proposes an evidence ledger, reproducibility audit, and reporting checklist to address these bottlenecks.

Key takeaway

If you are an AI Scientist or MLOps Engineer developing LLM-based trading agents, prioritize rigorous evaluation protocols. Your systems should explicitly report time-consistent data splits, transaction-cost models, and execution semantics (MR-1 to MR-7). Without reproducible artifacts and detailed logs, performance claims remain preliminary and incomparable, which hinders adoption and trust in real-world financial deployments. Focus on auditability to bridge the gap between architectural innovation and verifiable market impact.

Key insights

LLM-based trading agent research lacks comparable evaluation protocols, hindering reliable performance assessment and reproducibility.

Principles

Agentic trading systems require explicit perception-memory-reasoning-action loops.
Reproducibility hinges on transparent protocol reporting, not just headline performance.
Auditability demands grounded, time-stamped tool calls and execution logs.
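
One way to make the third principle concrete is an append-only log in which every tool call is time-stamped and each entry's hash covers the previous entry, so later edits become detectable. The sketch below is illustrative only and is not taken from the paper; the function names (append_entry, verify_chain), the field names, and the tool names in the usage lines are all assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder hash for the first entry (assumption)


def _digest(body):
    """Hash a log entry body deterministically (sorted keys, UTF-8)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log, tool, args, result, ts=None):
    """Append a time-stamped tool call whose hash chains to the previous entry."""
    body = {
        "ts": ts or datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "args": args,
        "result": result,
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify_chain(log):
    """Return True only if every entry is unmodified and correctly linked."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical usage: two tool calls by an agent, then an integrity check.
audit_log = []
append_entry(audit_log, "get_price", {"symbol": "XYZ"}, {"close": 101.5},
             ts="2025-01-02T14:30:00+00:00")
append_entry(audit_log, "submit_order", {"symbol": "XYZ", "side": "buy", "qty": 10},
             {"status": "accepted"}, ts="2025-01-02T14:30:05+00:00")
chain_ok = verify_chain(audit_log)
```

A hash chain only makes tampering detectable; it does not prevent it. In a real deployment the log would also need to be written to storage the agent cannot rewrite.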

Method

The paper proposes an audit-oriented evidence mapping approach, categorizing 77 studies into a primary empirical subset (n=19) and background (n=58), then auditing the primary subset for protocol completeness and reproducibility (R0-R3).
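
A protocol-completeness audit of this kind can be pictured as a small ledger with one row per study and one yes/no field per reporting item. The sketch below is a hypothetical illustration, not the paper's actual ledger schema: the class name, field names, and the three placeholder records are assumptions made up for the example and do not correspond to any real study.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ProtocolRecord:
    """One ledger row; field names are illustrative assumptions."""
    study_id: str
    time_consistent_split: bool
    transaction_cost_model: bool
    universe_survivorship: bool
    execution_semantics: bool


def coverage(records):
    """Map each protocol item to (number of studies reporting it, total studies)."""
    items = [f.name for f in fields(ProtocolRecord) if f.name != "study_id"]
    total = len(records)
    return {item: (sum(getattr(r, item) for r in records), total) for item in items}


# Placeholder records for demonstration only (not real studies).
ledger = [
    ProtocolRecord("study_a", True, False, False, True),
    ProtocolRecord("study_b", False, False, False, True),
    ProtocolRecord("study_c", False, True, False, False),
]
report = coverage(ledger)
for item, (count, total) in report.items():
    print(f"{item}: {count}/{total}")
```

Reporting counts as fractions of the audited subset, as the summary does with figures such as 2/19 and 11/19, keeps the denominator visible.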

In practice

Implement strict time-consistent data splits and embargo rules.
Report explicit transaction costs and execution semantics.
Provide code, data, and immutable logs for reproducibility.
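
The first two recommendations can be sketched in a few lines. The example below is a minimal sketch under stated assumptions, not the paper's protocol: the split drops an embargo window of observations just before the test period, signals decided at the close of bar t are held during bar t+1, and costs are a flat rate in basis points charged on turnover. The function names, parameters, and toy numbers are all hypothetical.

```python
def time_split_with_embargo(timestamps, split_index, embargo):
    """Return (train_idx, test_idx) for chronologically sorted timestamps.

    Training uses indices before split_index - embargo; testing uses indices
    from split_index onward. The embargo rows in between are discarded so
    that no training sample overlaps the start of the test period.
    """
    if any(a > b for a, b in zip(timestamps, timestamps[1:])):
        raise ValueError("timestamps must be sorted in ascending order")
    if not 0 <= embargo <= split_index <= len(timestamps):
        raise ValueError("need 0 <= embargo <= split_index <= len(timestamps)")
    train_idx = list(range(0, split_index - embargo))
    test_idx = list(range(split_index, len(timestamps)))
    return train_idx, test_idx


def net_returns(signals, asset_returns, cost_bps):
    """Per-bar strategy returns net of costs under next-bar execution.

    signals[t] is the target position decided at the close of bar t; it earns
    asset_returns[t + 1]. Costs are cost_bps per unit of position change.
    """
    cost_rate = cost_bps / 10_000
    position = 0.0
    out = []
    for t in range(1, len(asset_returns)):
        target = signals[t - 1]          # decided on the previous bar only
        turnover = abs(target - position)
        position = target
        out.append(position * asset_returns[t] - turnover * cost_rate)
    return out


# Toy data (made up for illustration).
days = [f"2025-01-{d:02d}" for d in range(1, 11)]
train_idx, test_idx = time_split_with_embargo(days, split_index=7, embargo=2)
pnl = net_returns(signals=[1, 1, 0, -1],
                  asset_returns=[0.00, 0.01, -0.02, 0.03],
                  cost_bps=10)
```

Here the final signal is never traded because no later bar exists to execute it on, which is the kind of execution-timing detail that should be stated explicitly in a report.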

Topics

Large Language Models
Algorithmic Trading
Financial Markets
Agentic AI
Reproducibility
Evaluation Protocols

Best for: Research Scientist, AI Scientist, MLOps Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.