Beyond Prompting: Using Agent Skills in Data Science

2026-04-17 · Source: Towards Data Science · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Intermediate, medium

Summary

The article introduces "skills" as reusable instruction packages for integrating large language models (LLMs) into data science workflows, building on the author's earlier discussion of the Model Context Protocol (MCP). A skill is defined by a `SKILL.md` file containing metadata and instructions, and can bundle scripts, templates, and examples to standardize AI-driven tasks. The author demonstrates the concept by automating a weekly data visualization process that previously took one hour, using two custom skills: a "storytelling-viz" skill for analysis and visualization generation, and a "viz-publish" skill for website deployment. Run in Codex Desktop against an Apple Health dataset from Google BigQuery, the automated process took under 10 minutes and produced an insight-driven interactive visualization. The author describes a two-step skill development process: initial planning with AI, followed by iterative refinement through integrating personal knowledge, researching external resources, and testing on more than 15 datasets.
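To make the `SKILL.md` structure concrete, here is a minimal sketch of splitting such a file into its metadata and its instructions. It assumes the metadata is a block of simple `key: value` lines between `---` delimiters at the top of the file; the `name` and `description` fields, the example text, and the `parse_skill` helper are illustrative assumptions, not taken from the article.

```python
def parse_skill(text: str) -> tuple[dict[str, str], str]:
    """Split a SKILL.md string into (metadata, instructions).

    Assumption: metadata is a block of `key: value` lines delimited by
    `---` lines at the top of the file. Real skills may use richer YAML.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        # No metadata block: treat the whole file as instructions.
        return {}, text.strip()

    # Find the line that closes the metadata block.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("unterminated metadata block")

    metadata = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            metadata[key.strip()] = value.strip()

    instructions = "\n".join(lines[end + 1:]).strip()
    return metadata, instructions


# Hypothetical file content; only the skill name comes from the article.
EXAMPLE_SKILL = """\
---
name: storytelling-viz
description: Turn a dataset into one insight-driven interactive chart.
---
1. Profile the dataset and pick the single most interesting insight.
2. Build an interactive visualization that leads with that insight.
"""

metadata, instructions = parse_skill(EXAMPLE_SKILL)
print(metadata["name"])
print(instructions)
```

Keeping metadata separate from the body means a tool can list available skills without reading the full instructions.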

Key takeaway

Data scientists who want to automate repetitive, domain-specific tasks should consider developing "skills" for their LLM-integrated workflows. A skill packages a complex, multi-step process into a reusable component, which cuts execution time (in the author's case, from about an hour to under 10 minutes) and improves consistency. Write clear instructions, build in your own expertise, and test and refine iteratively. This matters most for tasks that a single prompt handles poorly.

Key insights

Skills package instructions and resources for LLMs, enabling reliable automation of recurring data science workflows.

Principles

Skills keep main LLM context shorter.
Iterative refinement improves skill performance.
Modular skills enhance reusability.
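One way to picture why skills keep the main context shorter is the sketch below. It assumes the agent always sees only each skill's name and description and loads the full instructions for a skill only when that skill is in use. The `Skill` class, the `build_context` function, and the descriptions and placeholder instruction text are hypothetical illustrations, not the article's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Skill:
    name: str
    description: str
    instructions: str  # full body, only sent when the skill is active


def build_context(skills: list[Skill], active: Optional[str] = None) -> str:
    """List every skill briefly; expand only the active one in full."""
    parts = [f"- {s.name}: {s.description}" for s in skills]
    if active is not None:
        chosen = next(s for s in skills if s.name == active)
        parts.append(f"\n# Active skill: {chosen.name}\n{chosen.instructions}")
    return "\n".join(parts)


# Placeholder instruction bodies stand in for long, detailed skill files.
skills = [
    Skill("storytelling-viz", "Analyze a dataset and generate a visualization.", "step " * 500),
    Skill("viz-publish", "Deploy a finished visualization to the website.", "step " * 500),
]

idle = build_context(skills)                          # names and descriptions only
working = build_context(skills, active="viz-publish")  # plus one full body
print(len(idle), len(working))
```

Because each skill is a separate unit, adding a third skill grows the idle context by one short line rather than by a full instruction set.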

Method

Develop skills by planning with AI, then iteratively refine by integrating personal knowledge, researching external resources, and testing with diverse datasets to identify and address shortcomings.
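The testing part of this loop can be sketched as a small harness that runs a skill over several datasets and records which checks fail on which dataset. Everything here is a stand-in: `fake_skill` replaces the real agent invocation, and the datasets and checks are invented for illustration.

```python
from typing import Any, Callable


def find_shortcomings(
    run_skill: Callable[[Any], dict],
    datasets: dict[str, Any],
    checks: dict[str, Callable[[dict], bool]],
) -> list[tuple[str, str]]:
    """Run the skill on every dataset and return (dataset, failed_check) pairs."""
    failures = []
    for dataset_name, data in datasets.items():
        output = run_skill(data)
        for check_name, check in checks.items():
            if not check(output):
                failures.append((dataset_name, check_name))
    return failures


def fake_skill(rows: list[int]) -> dict:
    # Stand-in for invoking the agent with the skill; returns a toy result.
    title = f"{len(rows)} rows analysed" if rows else ""
    return {"title": title, "points": len(rows)}


# Invented sample inputs, including an edge case the skill should handle.
datasets = {"steps": [1200, 8400, 10050], "empty": []}
checks = {
    "has_title": lambda out: bool(out["title"]),
    "has_data": lambda out: out["points"] > 0,
}

failures = find_shortcomings(fake_skill, datasets, checks)
print(failures)
```

Each recorded failure points at a concrete gap to address in the skill's instructions before the next round of testing.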

In practice

Automate repetitive, semi-structured data science tasks.
Split complex workflows into independent skills.
Combine skills with MCP: MCP for tool access, skills for process adherence.

Topics

AI Agent Skills
Data Science Automation
Large Language Models
Data Visualization Workflow
Skill Development

Code references

yudong-94/storytelling-viz-skill

Best for: Data Scientist, AI Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.