How we built an internal data analytics agent
Summary
GitHub has developed Qubot, an internal Copilot-powered analytics agent, to give its employees, known as Hubbers, self-serve access to data. Launched on June 19, 2026, Qubot is available in Slack, VS Code, and the Copilot CLI; users ask natural language questions about GitHub's data warehouse and receive answers within seconds. Its architecture comprises a user interface, a federated context layer, and a query engine that connects to Kusto and Trino. The context layer, enriched by product and analytics teams, is crucial to Qubot's accuracy and speed, making the agent three times faster. An offline evaluation framework, built on curated test cases and automated runs, checks accuracy and catches regressions. Qubot has seen wide adoption, with hundreds of users running thousands of queries, which has significantly reduced reliance on dedicated analytics support and centralized previously distributed data knowledge.
Key takeaway
For AI engineers or MLOps teams considering internal data analytics solutions, a Copilot-powered agent like Qubot can significantly broaden data access. Letting people answer their own exploratory questions reduces the burden on data analysts. Focus on building a robust, federated context layer and an evaluation framework to ensure accuracy and performance. The context layer also centralizes distributed knowledge, which supports faster decision-making across your organization.
Key insights
A Copilot-powered analytics agent can provide self-serve data access, reducing reliance on dedicated support.
Principles
- Federated context layers enhance agent accuracy and speed.
- Standardized templates streamline context contribution.
- Offline evaluation frameworks are critical for agent quality.
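One way to picture a standardized template for context contributions is a small validator that checks each entry before it is accepted. This is a minimal sketch, not Qubot's actual template: the field names (`table`, `tier`, `owner`, `description`) and the `validate_entry` helper are assumptions made for illustration; only the bronze, silver, and gold tiers come from the article.

```python
# Hypothetical template fields; the real Qubot template is not described in the source.
REQUIRED_FIELDS = ("table", "tier", "owner", "description")
VALID_TIERS = {"bronze", "silver", "gold"}


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one context-layer entry (empty if valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        # Treat absent and blank values the same way.
        if not str(entry.get(field, "")).strip():
            errors.append(f"missing field: {field}")
    tier = entry.get("tier")
    if tier and tier not in VALID_TIERS:
        errors.append(f"invalid tier: {tier}")
    return errors
```

A check like this could run automatically on each proposed change, so reviewers only see entries that already follow the template.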
Method
Build an analytics agent with a UI, a federated context layer (bronze, silver, gold data), and a query engine (Kusto/Trino) that automatically selects the appropriate backend.
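The backend selection step can be sketched as a lookup against the context layer. The article does not say how Qubot chooses between Kusto and Trino, so the rule below (each table is registered with exactly one backend, and a query is routed by the tables it touches) is an assumption, as are the `TableContext` type, the `CATALOG` contents, and the `select_backend` function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableContext:
    name: str
    tier: str     # "bronze", "silver", or "gold"
    backend: str  # "kusto" or "trino"


# Hypothetical catalog entries, for illustration only.
CATALOG = {
    "events_raw": TableContext("events_raw", "bronze", "kusto"),
    "events_clean": TableContext("events_clean", "silver", "kusto"),
    "usage_daily": TableContext("usage_daily", "gold", "trino"),
}


def select_backend(tables: list[str], catalog: dict = CATALOG) -> str:
    """Pick the single backend that hosts every table the query needs."""
    unknown = [t for t in tables if t not in catalog]
    if unknown:
        raise KeyError(f"tables missing from context layer: {unknown}")
    backends = {catalog[t].backend for t in tables}
    if len(backends) != 1:
        # Covers both an empty table list and tables split across backends.
        raise ValueError(f"cannot route query; backends found: {sorted(backends)}")
    return backends.pop()
```

Under these assumptions, `select_backend(["events_raw", "events_clean"])` returns `"kusto"`, while a query spanning `events_raw` and `usage_daily` is rejected rather than silently sent to the wrong engine.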
In practice
- Integrate agents into Slack, VS Code, and CLI for accessibility.
- Use pull requests for context layer changes.
- Benchmark agent performance with curated test cases.
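Benchmarking with curated test cases can be sketched as a small offline harness that scores an agent and compares the result with a stored baseline. This is an illustrative sketch only: `EvalCase`, `run_eval`, `is_regression`, and the exact-match scoring rule are assumptions, and the source does not describe how Qubot's framework grades answers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    question: str
    expected: str


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases where the agent's answer matches the expected one."""
    if not cases:
        raise ValueError("no test cases provided")
    passed = sum(
        1
        for case in cases
        # Assumed scoring rule: case-insensitive exact match after trimming.
        if agent(case.question).strip().lower() == case.expected.strip().lower()
    )
    return passed / len(cases)


def is_regression(score: float, baseline: float, tolerance: float = 0.0) -> bool:
    """Flag a run whose score falls below the baseline by more than the tolerance."""
    return score < baseline - tolerance
```

In use, `agent` would wrap a call to the real agent, and the baseline would be the score recorded for the last accepted version, so a drop shows up before a change ships.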
Topics
- AI Agents
- Data Analytics
- GitHub Copilot
- Self-serve Data
- Evaluation Frameworks
- Trino
- Kusto
Best for: AI Engineer, MLOps Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by The GitHub Blog.