Agentic Trading: When LLM Agents Meet Financial Markets

Source: cs.AI updates on arXiv.org · Field: Finance & Economics — Capital Markets & Investment Management, FinTech & Digital Financial Services · Depth: Expert, extended

Summary

The paper "Agentic Trading: When LLM Agents Meet Financial Markets" provides an audit-oriented evidence map of 77 studies on Large Language Model (LLM)-based trading agents, screened through March 9, 2026. It reframes these agents as expert-system decision pipelines. A primary empirical subset of 19 studies, each satisfying "Action Output plus Closed-Loop Evaluation," shows that evaluation protocols are largely incomparable across studies. Only 2/19 studies report extractable time-consistent split protocols, 1/19 reports an explicit transaction-cost model, 1/19 documents universe/survivorship handling, and 11/19 report execution timing or semantics. On the survey's R0 to R3 reproducibility scale, 15/19 studies are rated R0 and none reaches R3. The survey introduces an Architecture–Capability–Adaptation (A-C-A) analytical lens and proposes an evidence ledger, a reproducibility audit, and a reporting checklist to address these bottlenecks.

Key takeaway

AI scientists and MLOps engineers developing LLM-based trading agents should prioritize rigorous evaluation protocols. Systems should explicitly report time-consistent data splits, transaction-cost models, and execution semantics (the paper's items MR-1 to MR-7). Without reproducible artifacts and detailed logs, performance claims remain preliminary and incomparable, which hinders adoption and trust in real-world financial deployments. Focus on auditability to bridge the gap between architectural innovation and verifiable market impact.
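To make two of these reporting items concrete, the following is a minimal Python sketch, not the paper's method. It assumes a date-cutoff split and a proportional transaction cost charged on turnover. The function names, the 10 basis point cost, and the sample dates and returns are all hypothetical choices for illustration.

```python
from datetime import date
from typing import List, Sequence, Tuple


def chronological_split(
    dates: Sequence[date], train_end: date, val_end: date
) -> Tuple[List[int], List[int], List[int]]:
    """Return index lists for train/validation/test, separated by date cutoffs.

    Every training date precedes every validation date, which in turn
    precedes every test date, so no future data leaks into earlier splits.
    """
    if train_end >= val_end:
        raise ValueError("train_end must precede val_end")
    train = [i for i, d in enumerate(dates) if d <= train_end]
    val = [i for i, d in enumerate(dates) if train_end < d <= val_end]
    test = [i for i, d in enumerate(dates) if d > val_end]
    return train, val, test


def net_returns(
    positions: Sequence[float],
    asset_returns: Sequence[float],
    cost_bps: float = 10.0,  # hypothetical cost level, in basis points
) -> List[float]:
    """Per-period strategy returns net of a proportional cost on turnover.

    Assumed execution semantics: positions[t] is the weight held during
    period t and was decided before that period began (no look-ahead).
    """
    if len(positions) != len(asset_returns):
        raise ValueError("positions and asset_returns must have equal length")
    cost_rate = cost_bps / 10_000.0
    result: List[float] = []
    previous = 0.0  # the strategy is assumed to start flat
    for position, ret in zip(positions, asset_returns):
        turnover = abs(position - previous)
        result.append(position * ret - cost_rate * turnover)
        previous = position
    return result


# Hypothetical sample data: six daily dates and a three-period position path.
dates = [date(2024, 1, d) for d in range(1, 7)]
train, val, test = chronological_split(dates, date(2024, 1, 3), date(2024, 1, 5))
example_net = net_returns([1.0, 1.0, 0.0], [0.01, -0.02, 0.03], cost_bps=10.0)
```

Stating the cutoffs, the cost rate, and the timing convention explicitly, as the function arguments and docstrings do here, is the kind of detail that lets a reader compare results across studies.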

Key insights

LLM-based trading agent research lacks comparable evaluation protocols, hindering reliable performance assessment and reproducibility.

Method

The paper proposes an audit-oriented evidence-mapping approach: it categorizes 77 studies into a primary empirical subset (n=19) and a background set (n=58), then audits the primary subset for protocol completeness and reproducibility (rated R0 to R3).
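The audit step can be pictured as a small ledger with one record per study and a tally of how many primary studies report each protocol item. The sketch below is an illustration under stated assumptions, not the paper's actual ledger schema. The class, field names, and three toy entries are hypothetical. The R0 to R3 levels are left out because this summary does not define them.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class LedgerEntry:
    """One audited study; field names are hypothetical, not the paper's schema."""
    study_id: str
    has_action_output: bool
    has_closed_loop_eval: bool
    time_consistent_split: bool = False
    transaction_cost_model: bool = False
    universe_survivorship: bool = False
    execution_semantics: bool = False


PROTOCOL_FIELDS = (
    "time_consistent_split",
    "transaction_cost_model",
    "universe_survivorship",
    "execution_semantics",
)


def primary_subset(entries: Sequence[LedgerEntry]) -> List[LedgerEntry]:
    """Keep studies with both an action output and a closed-loop evaluation."""
    return [e for e in entries if e.has_action_output and e.has_closed_loop_eval]


def protocol_coverage(entries: Sequence[LedgerEntry]) -> Dict[str, str]:
    """Report 'k/n' per protocol item over the primary subset."""
    primary = primary_subset(entries)
    n = len(primary)
    return {
        field: f"{sum(getattr(e, field) for e in primary)}/{n}"
        for field in PROTOCOL_FIELDS
    }


# Toy ledger with made-up entries; these are not studies from the survey.
toy_ledger = [
    LedgerEntry("S1", True, True, time_consistent_split=True, execution_semantics=True),
    LedgerEntry("S2", True, True),
    LedgerEntry("S3", True, False, execution_semantics=True),  # background only
]
coverage = protocol_coverage(toy_ledger)
```

Applied to the survey's real ledger, a tally of this shape would yield figures in the same form as the reported 2/19, 1/19, 1/19, and 11/19.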

Best for: Research Scientist, AI Scientist, MLOps Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.