Unsupervised Skill Discovery for Agentic Data Analysis
Summary
DataCOPE is an unsupervised, verifier-guided skill discovery framework that enhances data-analytic agents by injecting reusable procedural knowledge without updating model parameters. It addresses the challenge of discovering effective data analysis skills from unlabeled exploration by iteratively coordinating a Data-Analytic Agent for trajectory generation, an Unsupervised Verifier for signal extraction, and a Skill Manager for contrastive skill distillation. For report-style analysis, DataCOPE employs an Adaptive Checklist Verifier that refines task-specific criteria and scores reports by verifiable coverage. For reasoning-style analysis, it uses an Answer Agreement Verifier that groups trajectories by answer agreement and leverages self-consistency. Evaluated on Deep Data Research (report-style) and DABStep (reasoning-style), DataCOPE consistently improved held-out performance. Across matched base models (Claude-Sonnet-4.6, Claude-Sonnet-4.5, GPT-5.2, DeepSeek-V4-Pro, Qwen3.5-397B), it boosted mean scores by 9.71% on report-style tasks and 32.30% on reasoning-style tasks, outperforming baselines such as Anthropic's Skill Creator.
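The idea of scoring a report by verifiable coverage can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `ChecklistItem` type, the keyword-matching check (a stand-in for whatever verification the actual verifier performs), and the sample checklist are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistItem:
    """One task-specific criterion (hypothetical structure).

    `keywords` is a crude stand-in for a real verification step.
    """
    description: str
    keywords: tuple[str, ...]


def coverage_score(report: str, checklist: list[ChecklistItem]) -> float:
    """Fraction of checklist items the report covers (0.0 if the checklist is empty)."""
    if not checklist:
        return 0.0
    text = report.lower()
    covered = sum(
        1
        for item in checklist
        if all(keyword.lower() in text for keyword in item.keywords)
    )
    return covered / len(checklist)


# Hypothetical checklist and report, for illustration only.
checklist = [
    ChecklistItem("Reports the revenue trend", ("revenue", "trend")),
    ChecklistItem("Mentions missing values", ("missing",)),
    ChecklistItem("Identifies outliers", ("outlier",)),
    ChecklistItem("States a correlation", ("correlation",)),
]
report = "Revenue shows an upward trend; 3% of rows have missing values."
score = coverage_score(report, checklist)  # two of four items are covered
```

In this sketch the score is simply the share of criteria met, so reports generated for the same task can be ranked against each other without labels.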
Key takeaway
For Machine Learning Engineers developing data-analytic agents, DataCOPE offers a way to improve agent performance without manual supervision or parameter updates. Consider implementing its unsupervised, verifier-guided skill discovery framework to distill reusable procedural knowledge from unlabeled exploration. The approach improves held-out scores on both report-style tasks (9.71% mean gain) and reasoning-style tasks (32.30% mean gain) while reducing token consumption, and the gains hold across the base models tested.
Key insights
DataCOPE enables unsupervised skill discovery for data-analytic agents using verifier-guided iterative refinement of exploration trajectories.
Principles
- Unsupervised verifier signals are crucial for effective skill discovery.
- Iterative refinement improves skill generalization and robustness.
- Appropriate skill granularity is essential for balancing specialization and generalization.
Method
DataCOPE iteratively coordinates a Data-Analytic Agent, an Unsupervised Verifier (Adaptive Checklist or Answer Agreement), and a Skill Manager to distill reusable procedures from grouped exploration trajectories.
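The coordination described above can be sketched as a simple loop. This is a rough illustration under stated assumptions, not the authors' implementation: the callables `run_agent`, `verify`, and `distill`, the parameters `rounds` and `samples_per_task`, and the choice to contrast only the best and worst trajectory per task are all hypothetical.

```python
from typing import Callable

Trajectory = str          # placeholder type for an exploration trajectory
Skills = list[str]        # placeholder type for the skill library


def discover_skills(
    tasks: list[str],
    run_agent: Callable[[str, Skills], Trajectory],
    verify: Callable[[str, list[Trajectory]], list[float]],
    distill: Callable[[Trajectory, Trajectory, Skills], Skills],
    rounds: int = 3,
    samples_per_task: int = 4,
) -> Skills:
    """Iteratively grow a skill library from unlabeled tasks (illustrative only)."""
    skills: Skills = []
    for _ in range(rounds):
        for task in tasks:
            # 1. Data-Analytic Agent: sample several trajectories with current skills.
            trajectories = [run_agent(task, skills) for _ in range(samples_per_task)]
            # 2. Unsupervised Verifier: one label-free score per trajectory.
            scores = verify(task, trajectories)
            ranked = sorted(zip(scores, trajectories), key=lambda pair: pair[0])
            (low_score, worst), (high_score, best) = ranked[0], ranked[-1]
            # 3. Skill Manager: distill only when there is a contrast to learn from.
            if high_score > low_score:
                skills = distill(best, worst, skills)
    return skills
```

Passing the three components as callables keeps the loop independent of any particular model or verifier, so either verifier type could be plugged in as `verify`.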
In practice
- Implement an Adaptive Checklist Verifier for open-ended report generation tasks.
- Use an Answer Agreement Verifier with self-consistency for fixed-answer reasoning tasks.
- Filter out trajectories whose self-consistency is already saturated, so skill refinement focuses on uncertain cases.
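For fixed-answer tasks, answer agreement and the saturation filter could look like the following sketch. The function names, the string normalization, and the `saturation` threshold of 1.0 (full agreement) are assumptions made for illustration; the source does not specify how saturation is defined.

```python
from collections import Counter


def majority_agreement(answers: list[str]) -> tuple[str, float]:
    """Return the majority answer and the fraction of samples agreeing with it."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Light normalization so trivially different strings count as the same answer.
    normalized = [answer.strip().lower() for answer in answers]
    majority, count = Counter(normalized).most_common(1)[0]
    return majority, count / len(normalized)


def keep_uncertain(
    sampled: dict[str, list[str]],
    saturation: float = 1.0,  # hypothetical threshold: 1.0 means every sample agrees
) -> dict[str, tuple[str, float]]:
    """Drop tasks whose agreement has reached the saturation threshold."""
    kept: dict[str, tuple[str, float]] = {}
    for task_id, answers in sampled.items():
        majority, agreement = majority_agreement(answers)
        if agreement < saturation:
            kept[task_id] = (majority, agreement)
    return kept


# Hypothetical sampled answers, four per task.
sampled = {
    "q1": ["42", "42", "42", "42"],          # saturated: all samples agree
    "q2": ["0.15", "0.15", "0.20", "0.15"],  # mostly agree
    "q3": ["NL", "DE", "NL", "FR"],          # split
}
uncertain = keep_uncertain(sampled)  # keeps q2 and q3, drops q1
```

Tasks where every sample already agrees offer no contrast between trajectories, which is why this sketch drops them before skill refinement.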
Topics
- Unsupervised Skill Discovery
- Data-Analytic Agents
- Large Language Models
- Adaptive Checklist Verifier
- Answer Agreement Verifier
- Deep Data Research
- DABStep
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.