QCon London 2026: Reliable Retrieval for Production AI Systems
Summary
Lan Chu, AI Tech Lead at Rabobank, shared lessons from deploying an internal AI search system that serves more than 300 users and covers 10,000 documents. The system is built on a typical RAG pipeline for insight extraction. In production it ran into problems with document quality, retrieval relevance, and evaluation, which the team addressed through architectural and methodological changes. For parsing, the pipeline combines traditional text extraction with visual-language models for complex layouts, and content is split using section-based chunking. For retrieval, temporal scoring and a routing layer supply context that vector similarity alone does not capture. Evaluation frameworks built on real user queries were considered essential for reliable performance. The overall lesson is that effective AI search depends on accurate parsing, validated chunking, context-aware retrieval, and structured evaluation.
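To make the idea of temporal scoring concrete, here is a minimal Python sketch of one way to blend vector similarity with document recency. It is not taken from the talk: the exponential half-life decay, the function names `temporal_score` and `rerank`, and the `half_life_days` and `recency_weight` parameters are all illustrative assumptions.

```python
from datetime import date


def temporal_score(similarity, doc_date, today,
                   half_life_days=365.0, recency_weight=0.3):
    """Blend a vector-similarity score with a recency score.

    Assumption: recency halves every `half_life_days`; the talk summary
    does not say which decay function or weighting was used.
    """
    age_days = max((today - doc_date).days, 0)
    recency = 0.5 ** (age_days / half_life_days)
    return (1.0 - recency_weight) * similarity + recency_weight * recency


def rerank(hits, today, **kwargs):
    """Sort retrieval hits by the blended score, best first.

    Each hit is a dict with hypothetical keys "similarity" and "date".
    """
    return sorted(
        hits,
        key=lambda h: temporal_score(h["similarity"], h["date"], today, **kwargs),
        reverse=True,
    )


if __name__ == "__main__":
    hits = [
        {"id": "old-report", "similarity": 0.82, "date": date(2022, 1, 1)},
        {"id": "new-report", "similarity": 0.78, "date": date(2025, 12, 1)},
    ]
    for hit in rerank(hits, today=date(2026, 3, 1)):
        print(hit["id"])
```

With these assumed weights, a slightly less similar but much newer document can outrank an older one. In practice the weight and half-life would need tuning against real queries.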
Key takeaway
Running a production RAG system over 10,000 enterprise documents takes more than a typical RAG architecture. Rabobank's experience suggests that success hinges on using visual-language models for accurate parsing, validating chunking against data, and augmenting vector retrieval with temporal scoring and a routing layer. Evaluating with real user queries and tracking specific failure modes is critical for reliable, scalable internal knowledge retrieval.
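As a rough illustration of evaluating with real user queries while tracking failure modes, the sketch below computes a hit rate at k and tallies two failure categories. The category names, the `evaluate` function, and the shape of the test cases are assumptions made for this example, not details reported from the talk.

```python
from collections import Counter


def evaluate(cases, retrieve, k=5):
    """Run logged queries through `retrieve` and summarize the outcomes.

    `cases` is a list of dicts with hypothetical keys "query" and
    "relevant_ids"; `retrieve` returns a ranked list of document ids.
    """
    failures = Counter()
    hits = 0
    for case in cases:
        ranked = retrieve(case["query"])[:k]
        relevant = set(case["relevant_ids"])
        if relevant & set(ranked):
            hits += 1
            # Found, but the first result is not a relevant document.
            if ranked[0] not in relevant:
                failures["relevant_not_top_ranked"] += 1
        else:
            # No relevant document appears in the top k at all.
            failures["retrieval_miss"] += 1
    hit_rate = hits / len(cases) if cases else 0.0
    return {"hit_rate_at_k": hit_rate, "failures": dict(failures)}
```

Here `retrieve` stands in for whatever search function is under test. Keeping per-category counts, and not only a single aggregate score, shows whether a regression comes from missing documents or from poor ranking.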
Topics
- RAG Pipeline
- AI Search Systems
- Document Retrieval
- Visual-Language Models
- Production AI
Best for: AI Engineer, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by InfoQ.