Closing the knowledge gap with agent skills

Source: Google Developers Blog - AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Intermediate, short

Summary

Google DeepMind developed and evaluated a "skill" for the Gemini API to close a knowledge gap in large language models (LLMs): their training data lags behind rapidly evolving software engineering practices and SDK changes. The skill, available on GitHub, explains API features, describes current models and SDKs, demonstrates sample code, and lists documentation entry points. To test it, the team built an evaluation harness of 117 Python and TypeScript code generation prompts. The latest Gemini 3 series models (3.0 Pro, 3.0 Flash, and 3.1 Pro) achieved significantly higher pass rates with the skill, rising from baselines of 6.8% and 28% without it, while the older 2.5 series models benefited less. The skill was effective across most domains; "SDK Usage" had the lowest pass rate, at 95%.

Key takeaway

AI architects designing agentic coding systems can significantly improve model accuracy on dynamic SDKs and evolving best practices by integrating agent skills such as the Gemini API developer skill. Consider implementing similar skills to give LLMs up-to-date information, especially for newer models with strong reasoning capabilities. Plan for a skill update mechanism so that outdated information does not accumulate in user environments.
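One way to guard against stale skills is a simple freshness check at load time. The sketch below is illustrative only: the function names, the dotted version scheme, and the 30-day threshold are assumptions for the example, not part of the Gemini API skill or of any published update mechanism.

```python
from datetime import date, timedelta
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "1.10.0" into (1, 10, 0) for comparison."""
    return tuple(int(part) for part in text.split("."))


def is_stale(
    installed_version: str,
    latest_version: str,
    installed_on: date,
    today: date,
    max_age_days: int = 30,  # assumed threshold, chosen only for illustration
) -> bool:
    """Flag a skill that is behind the latest release or older than the threshold."""
    behind = parse_version(installed_version) < parse_version(latest_version)
    too_old = (today - installed_on) > timedelta(days=max_age_days)
    return behind or too_old


if __name__ == "__main__":
    # Hypothetical installed skill: version 1.2.0, installed on 1 January 2025.
    print(is_stale("1.2.0", "1.10.0", date(2025, 1, 1), date(2025, 1, 5)))
```

Comparing parsed tuples rather than raw strings avoids the common mistake of ranking "1.10.0" below "1.2.0".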

Key insights

Agent skills effectively bridge LLM knowledge gaps for rapidly changing software development practices.

Principles

Method

A skill was built to explain API features, describe models and SDKs, demonstrate code, and list documentation entry points. Performance was evaluated on 117 Python and TypeScript code generation prompts, each run in two modes: "vanilla" (without the skill) and skill-enabled.
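The two-mode comparison can be sketched as a small harness. This is a minimal illustration, not the original evaluation code: `EvalPrompt`, `compare_modes`, the stub generator, and the stub checker are all hypothetical names, and a real harness would call a model and execute or lint the generated code instead of the stubs used here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalPrompt:
    domain: str    # e.g. "SDK Usage"
    language: str  # "python" or "typescript"
    text: str


# Assumed signatures: a generator returns code for a prompt in a given mode,
# and a checker decides whether that code passes.
Generate = Callable[[EvalPrompt, bool], str]
Passes = Callable[[EvalPrompt, str], bool]


def pass_rate(prompts: List[EvalPrompt], generate: Generate,
              passes: Passes, skill_enabled: bool) -> float:
    """Fraction of prompts whose generated code passes the check."""
    if not prompts:
        return 0.0
    passed = sum(1 for p in prompts if passes(p, generate(p, skill_enabled)))
    return passed / len(prompts)


def compare_modes(prompts: List[EvalPrompt], generate: Generate,
                  passes: Passes) -> Dict[str, Dict[str, float]]:
    """Pass rates for vanilla and skill-enabled modes, overall and per domain."""
    by_domain: Dict[str, List[EvalPrompt]] = defaultdict(list)
    for p in prompts:
        by_domain[p.domain].append(p)
    groups = {"overall": list(prompts), **by_domain}
    return {
        name: {
            "vanilla": pass_rate(group, generate, passes, skill_enabled=False),
            "skill": pass_rate(group, generate, passes, skill_enabled=True),
        }
        for name, group in groups.items()
    }


# Stubs standing in for a model call and a code check (illustration only).
def fake_generate(prompt: EvalPrompt, skill_enabled: bool) -> str:
    return "uses current SDK" if skill_enabled else "uses deprecated SDK"


def fake_passes(prompt: EvalPrompt, code: str) -> bool:
    return "current" in code


SAMPLE_PROMPTS = [
    EvalPrompt("SDK Usage", "python", "Generate text with the Gemini API."),
    EvalPrompt("SDK Usage", "typescript", "Stream a response from the Gemini API."),
    EvalPrompt("Example Domain", "python", "Placeholder prompt for a second domain."),
]

if __name__ == "__main__":
    for name, rates in compare_modes(SAMPLE_PROMPTS, fake_generate, fake_passes).items():
        print(name, rates)
```

Running every prompt through both modes with the same checker keeps the comparison paired, so any difference in pass rate is attributable to the skill rather than to a different prompt mix.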

Best for: AI Architect, AI Engineer, Machine Learning Engineer, Prompt Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Google Developers Blog - AI.