How Grab is Using AI Agents to Boost Team Productivity
Summary
Grab's Analytics Data Warehouse (ADW) team manages over 15,000 tables and serves 1,000 monthly users. To automate responses to common data-related queries, the team built a multi-agent AI system with FastAPI and LangGraph. The system decouples an LLM "brain" from specialized "hand" agents, each focused on a narrow domain such as data investigation or code search. It has two pathways: an "investigation pathway" for read-only queries, which uses Classifier, Data, Code Search, On-call, and Summarizer agents, and an "enhancement pathway" for write operations, which is handled by a single Enhancement Agent with mandatory human review. The system integrates with internal platforms such as Hubble, Genchi, and Lighthouse for metadata, data quality, and pipeline status. Early production use exposed problems with context overflow, tool bloat, and risky code execution. Grab addressed these with token management, simplified tool descriptions, multi-layered safety controls, and a human review system with immediate, labeled responses.
Key takeaway
If your data engineering team struggles with high volumes of repetitive support questions, consider implementing a multi-agent AI system to automate investigations and enhancement requests. Prioritize a modular architecture with specialized agents and robust safety layers, especially for write operations, to ensure accuracy and maintain trust. Establish a continuous feedback loop from human reviews to refine agent prompts and guardrails, so that reviewer annotations actively drive improvement.
Key insights
Specialized multi-agent AI systems can automate complex, repetitive data engineering tasks while maintaining human oversight for high-risk operations.
Principles
- Decouple LLM reasoning ("brain") from tool execution ("hands").
- Specialize agents for narrow domains over monolithic models.
- Apply autonomy levels based on operation risk profile.
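The three principles above can be illustrated with a minimal sketch. It is not Grab's implementation: the names `Hand`, `Risk`, `dispatch`, and `keyword_brain` are hypothetical, and the "brain" here is a keyword stub standing in for an LLM call. The point it shows is that the brain only chooses which hand to use, each hand covers one narrow domain, and the risk level attached to a hand decides whether its output needs human review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Risk(Enum):
    READ_ONLY = "read_only"  # answer can be returned automatically
    WRITE = "write"          # proposal must wait for human review


@dataclass(frozen=True)
class Hand:
    """A narrow, specialized agent: one domain, one callable (hypothetical type)."""
    name: str
    risk: Risk
    run: Callable[[str], str]


@dataclass
class Result:
    agent: str
    output: str
    needs_human_review: bool


def dispatch(decide: Callable[[str], str], hands: Dict[str, Hand], query: str) -> Result:
    """`decide` plays the brain: it only picks a hand by name and never executes tools."""
    choice = decide(query)
    if choice not in hands:
        raise KeyError(f"brain selected unknown hand: {choice!r}")
    hand = hands[choice]
    output = hand.run(query)
    # Autonomy follows the risk profile: write-type hands always require review.
    return Result(hand.name, output, needs_human_review=(hand.risk is Risk.WRITE))


def keyword_brain(query: str) -> str:
    """Stand-in for the LLM; a real system would prompt a model here."""
    q = query.lower()
    if "add column" in q:
        return "enhancement"
    if "code" in q:
        return "code_search"
    return "data"


# Hypothetical hands; each lambda stands in for a real agent.
HANDS = {
    "data": Hand("data", Risk.READ_ONLY, lambda q: f"investigated: {q}"),
    "code_search": Hand("code_search", Risk.READ_ONLY, lambda q: f"searched code for: {q}"),
    "enhancement": Hand("enhancement", Risk.WRITE, lambda q: f"proposed change for: {q}"),
}
```

Calling `dispatch(keyword_brain, HANDS, "Please add column region to orders")` returns a result with `needs_human_review` set to `True`, while a read-only question is answered without that flag.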
Method
Implement a multi-agent AI system with distinct pathways for read-only and write operations. Use a supervisor agent for routing, specialized agents for individual tasks, and a summarizer. Integrate with data catalog, data quality, and pipeline platforms. Apply token management, simplified tool descriptions, and multi-layered safety controls.
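The token management step could look roughly like the sketch below. It is an illustration under assumptions, not the article's code: `make_token_counter`, `fit_to_budget`, and the `summarize` callback are hypothetical names, and the budget policy (keep the newest messages, collapse everything older into one summary) is one simple choice among many. The counter uses `tiktoken` with the `cl100k_base` encoding when the library is available and falls back to a crude word count otherwise.

```python
from typing import Callable, List


def make_token_counter() -> Callable[[str], int]:
    """Prefer tiktoken when available; otherwise fall back to a rough word count."""
    try:
        import tiktoken  # third-party: pip install tiktoken

        enc = tiktoken.get_encoding("cl100k_base")
        # disallowed_special=() treats special-token text as ordinary text.
        return lambda text: len(enc.encode(text, disallowed_special=()))
    except Exception:
        # Assumption: a whitespace split is only a crude approximation.
        return lambda text: len(text.split())


def fit_to_budget(
    messages: List[str],
    budget: int,
    count_tokens: Callable[[str], int],
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Keep the newest messages within `budget` tokens; collapse older ones into a summary.

    The summary itself is not counted against `budget`; a real system would
    reserve room for it.
    """
    kept: List[str] = []
    used = 0
    # Walk backwards so the most recent context survives.
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()
    older = messages[: len(messages) - len(kept)]
    if not older:
        return kept
    return [summarize(older)] + kept
```

In use, `count_tokens` would come from `make_token_counter()` and `summarize` would call an LLM; both are passed in as parameters so the trimming logic can be checked without either dependency.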
In practice
- Use `tiktoken` to count tokens in real time, and summarize context when it grows too large.
- Validate SQL for PII access, dangerous operations, and missing partition filters.
- Design human review for AI-generated responses with clear labels.
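The SQL validation item can be sketched as a small pre-execution check. This is a simplified illustration rather than the system described in the article: the column names in `PII_COLUMNS`, the `date_partition` column, and the `validate_sql` function are all hypothetical, and the regex matching is deliberately naive. A production check would more likely rely on a proper SQL parser, since keyword matching can be fooled by string literals and comments.

```python
import re
from typing import List

# Hypothetical PII column names; a real list would come from a data catalog.
PII_COLUMNS = {"email", "phone_number", "national_id"}

# Statements that modify data or schema are treated as dangerous.
DANGEROUS = re.compile(
    r"\b(DROP|DELETE|TRUNCATE|ALTER|UPDATE|INSERT|GRANT)\b", re.IGNORECASE
)

# Hypothetical partition column that every query is expected to filter on.
PARTITION_COLUMN = "date_partition"


def validate_sql(sql: str) -> List[str]:
    """Return a list of problem codes; an empty list means the query passed."""
    problems: List[str] = []

    if DANGEROUS.search(sql):
        problems.append("dangerous_operation")

    # Compare every identifier-like word against the PII list.
    identifiers = set(re.findall(r"[a-z_][a-z0-9_]*", sql.lower()))
    if identifiers & PII_COLUMNS:
        problems.append("pii_column")

    # Require the partition column somewhere after the first WHERE keyword.
    where = re.search(r"\bwhere\b(.*)", sql, re.IGNORECASE | re.DOTALL)
    if where is None or PARTITION_COLUMN not in where.group(1).lower():
        problems.append("missing_partition_filter")

    return problems
```

A query is allowed to run only when `validate_sql` returns an empty list; anything else can be rejected or routed to human review along with the problem codes.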
Topics
- Multi-agent AI Systems
- Data Engineering Automation
- LLM Application Architecture
- AI Safety & Guardrails
- Human-in-the-Loop AI
Best for: AI Engineer, MLOps Engineer, Data Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by ByteByteGo Newsletter.