QCon London 2026: Reliable Retrieval for Production AI Systems

2026-03-17 · Source: InfoQ · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Intermediate, quick

Summary

Lan Chu, AI Tech Lead at Rabobank, shared lessons from deploying an internal production AI search system, used by more than 300 users to search 10,000 documents and built on a typical RAG pipeline for insight extraction. In production, the system ran into problems with document quality, retrieval relevance, and evaluation, each of which the team addressed with targeted architectural or methodological changes. For parsing, the team combined traditional text extraction with visual-language models for complex layouts, and chunked content by section. For retrieval, it added temporal scoring and a routing layer so that results reflect context beyond vector similarity alone. For evaluation, it relied on frameworks built around real user queries, which the talk presented as essential for reliable performance. The overall lesson is that effective AI search depends on accurate parsing, validated chunking, context-aware retrieval, and structured evaluation.
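
To make the temporal scoring idea concrete, here is a minimal Python sketch of one common way to blend vector similarity with document recency. It is an illustration only, not Rabobank's implementation: the exponential half-life decay, the `recency_weight` of 0.3, the one-year half-life, and the `Hit` structure are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    """One vector-search result (hypothetical structure for this sketch)."""
    doc_id: str
    similarity: float  # assumed to be a cosine similarity in [0, 1]
    published: date


def recency(published: date, today: date, half_life_days: float = 365.0) -> float:
    """Exponential decay: 1.0 for a document published today, 0.5 after one half-life."""
    age_days = max((today - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def rerank(hits, today, recency_weight=0.3, half_life_days=365.0):
    """Sort hits by a weighted blend of similarity and recency (weights are assumptions)."""
    def score(hit):
        return ((1 - recency_weight) * hit.similarity
                + recency_weight * recency(hit.published, today, half_life_days))
    return sorted(hits, key=score, reverse=True)


hits = [
    Hit("old-report", similarity=0.80, published=date(2021, 1, 1)),
    Hit("new-report", similarity=0.75, published=date(2026, 1, 1)),
]
ranked = rerank(hits, today=date(2026, 3, 17))
print([hit.doc_id for hit in ranked])
```

With these assumed weights, the recent document ranks above the older one despite its slightly lower similarity. In practice the weight and half-life would need tuning against real queries, since an aggressive decay can bury older documents that are still authoritative.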

Key takeaway

Running a production RAG system over 10,000 enterprise documents takes more than a typical RAG architecture. Rabobank's experience suggests that success depends on combining traditional text extraction with visual-language models for accurate parsing, validating chunking against data, and augmenting vector retrieval with temporal scoring and a routing layer. Evaluating with real user queries and tracking specific failure modes is critical for reliable, scalable internal knowledge retrieval.
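
As a rough illustration of evaluating with real user queries while tracking failure modes, the Python sketch below computes a hit rate at k and tallies misses by a reviewer-assigned tag. The case format, the tag names, the `retrieve` callable, and the choice of hit rate as the metric are all hypothetical; the source does not describe the team's actual evaluation framework.

```python
from collections import Counter


def evaluate(cases, retrieve, k=5):
    """Hit rate at k over logged queries, with misses grouped by tag.

    cases: list of dicts with 'query', 'relevant_ids' (a set of document ids),
           and an optional 'tag' assigned by a reviewer (hypothetical format).
    retrieve: callable mapping a query string to a ranked list of document ids.
    """
    hits = 0
    misses_by_tag = Counter()
    for case in cases:
        top_k = retrieve(case["query"])[:k]
        if set(top_k) & case["relevant_ids"]:
            hits += 1
        else:
            misses_by_tag[case.get("tag", "untagged")] += 1
    hit_rate = hits / len(cases) if cases else 0.0
    return {"hit_rate": hit_rate, "misses_by_tag": dict(misses_by_tag)}


# Stand-in retriever backed by a fixed lookup table, for demonstration only.
fake_index = {
    "latest mortgage policy": ["doc-7", "doc-2"],
    "q3 risk table": ["doc-4"],
    "onboarding checklist": ["doc-9", "doc-1"],
}
cases = [
    {"query": "latest mortgage policy", "relevant_ids": {"doc-3"}, "tag": "time_sensitive"},
    {"query": "q3 risk table", "relevant_ids": {"doc-4"}, "tag": "table_lookup"},
    {"query": "onboarding checklist", "relevant_ids": {"doc-1"}, "tag": "general"},
]
report = evaluate(cases, lambda query: fake_index.get(query, []), k=5)
print(report)
```

Grouping misses by tag is one simple way to see whether failures cluster around a particular kind of query, such as time-sensitive questions or table lookups, rather than reading a single aggregate score.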

Topics

RAG Pipeline
AI Search Systems
Document Retrieval
Visual-Language Models
Production AI

Best for: AI Engineer, MLOps Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by InfoQ.