Unifying Ranking and Generation in Query Auto-Completion via Retrieval-Augmented Generation and Multi-Objective Alignment
Summary
This work reformulates Query Auto-Completion (QAC) as end-to-end list generation within a single unified framework. It integrates Retrieval-Augmented Generation (RAG) with multi-objective Direct Preference Optimization (DPO) to address two known weaknesses: traditional retrieve-and-rank pipelines struggle with long-tail coverage, and purely generative methods risk hallucination. The main contributions are multi-objective optimization for list generation, a synthetic data methodology that combines RAG with learned and rule-based verifiers, and a hybrid serving architecture for efficient production deployment. Evaluated on a large-scale commercial search platform, the framework gained +0.40 to +0.69 in preference scores in human evaluation and, in online experiments, reduced keystrokes by 5.44% and increased suggestion adoption by 3.46%.
Key takeaway
For AI Engineers developing search and recommendation systems, this framework offers a production-validated method to enhance QAC. Consider adopting a RAG-powered, end-to-end generative approach with multi-objective DPO to improve suggestion quality and user engagement. The reported gains (a 5.44% reduction in keystrokes and a 3.46% increase in suggestion adoption) were measured on the authors' platform, so treat them as a reference point rather than an expected result for your own system.
Key insights
QAC can be effectively reframed as end-to-end list generation using RAG and multi-objective DPO.
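To make the retrieval-augmented part concrete, here is a minimal sketch of how retrieved candidates could be placed in front of a list generator. The in-memory `query_log`, the prefix-match retrieval, and the prompt wording are all assumptions for illustration; the summary does not describe the actual retriever or prompt.

```python
from typing import Dict, List


def retrieve_candidates(prefix: str, query_log: Dict[str, int], k: int = 5) -> List[str]:
    """Return up to k logged queries that extend the prefix, most frequent first.

    Hypothetical retriever: a real system would use an index, not a dict scan.
    """
    key = prefix.strip().lower()
    matches = [(q, n) for q, n in query_log.items() if q.startswith(key)]
    # Sort by descending frequency, then alphabetically for a stable order.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:k]]


def build_prompt(prefix: str, candidates: List[str], list_size: int = 8) -> str:
    """Assemble a generation prompt grounded in the retrieved candidates."""
    lines = [f"Prefix: {prefix}", "Retrieved candidates:"]
    lines += [f"- {c}" for c in candidates] or ["- (none)"]
    lines.append(f"Write a ranked list of up to {list_size} completions.")
    return "\n".join(lines)


log = {"iphone 15": 90, "iphone case": 40, "ipad": 70, "iphone 15 pro": 90}
top = retrieve_candidates("ipho", log, k=2)
prompt = build_prompt("ipho", top)
```

Grounding the generator in logged queries is one plausible way to limit hallucinated suggestions while still letting the model compose the final list.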
Principles
- Combine RAG with multi-objective DPO for QAC.
- Use learned and rule-based verifiers for data quality.
- Employ iterative critique-revision for synthetic data.
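The verifier and critique-revision principles above can be sketched as a small loop: check a draft list against rules, hand the violations to a reviser, and repeat until the list passes or a round limit is hit. The specific rules (prefix match, no duplicates, a size cap), the function names, and the toy reviser are assumptions; in the described framework a learned verifier and a model-based reviser would presumably fill these roles.

```python
from typing import Callable, List, Optional


def rule_based_verifier(prefix: str, suggestions: List[str], max_items: int = 8) -> List[str]:
    """Return a list of rule violations; an empty list means the draft passes."""
    problems = []
    key = prefix.strip().lower()
    if len(suggestions) > max_items:
        problems.append("too_many_items")
    seen = set()
    for s in suggestions:
        norm = s.strip().lower()
        if not norm.startswith(key):  # assumed rule: completions extend the prefix
            problems.append(f"not_a_completion:{s}")
        if norm in seen:
            problems.append(f"duplicate:{s}")
        seen.add(norm)
    return problems


Reviser = Callable[[str, List[str], List[str]], List[str]]


def critique_revise(prefix: str, draft: List[str], revise: Reviser,
                    max_rounds: int = 3) -> Optional[List[str]]:
    """Revise the draft until it passes verification; return None if it never does."""
    current = list(draft)
    for _ in range(max_rounds):
        problems = rule_based_verifier(prefix, current)
        if not problems:
            return current
        current = revise(prefix, current, problems)
    return current if not rule_based_verifier(prefix, current) else None


def drop_offenders(prefix: str, suggestions: List[str], problems: List[str]) -> List[str]:
    """Toy reviser standing in for a model call: keep only valid, unique items."""
    key, kept, seen = prefix.strip().lower(), [], set()
    for s in suggestions:
        norm = s.strip().lower()
        if norm.startswith(key) and norm not in seen:
            kept.append(s)
            seen.add(norm)
    return kept[:8]


draft = ["iphone 15", "iPhone 15", "android phone", "iphone case"]
clean = critique_revise("ipho", draft, drop_offenders)
```

Returning `None` for drafts that never pass lets a data pipeline discard them instead of training on low-quality lists.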
Method
Reformulate QAC as end-to-end list generation, integrating RAG, multi-objective DPO, and verifiers. Utilize iterative critique-revision for high-quality synthetic data generation, and deploy with a hybrid serving architecture.
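For readers unfamiliar with DPO, the sketch below computes the standard per-pair DPO loss from sequence log-probabilities and then combines several objectives as a weighted average. The weighted-average combination and the objective names are assumptions made for illustration; the summary does not state how the framework balances its objectives.

```python
import math
from typing import Dict, Tuple

# (policy logp chosen, policy logp rejected, reference logp chosen, reference logp rejected)
Pair = Tuple[float, float, float, float]


def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Standard DPO loss for one preference pair: -log(sigmoid(margin))."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Numerically stable form of log(1 + exp(-margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def multi_objective_dpo_loss(pairs_by_objective: Dict[str, Pair],
                             weights: Dict[str, float],
                             beta: float = 0.1) -> float:
    """Weighted average of per-objective DPO losses (assumed combination rule)."""
    total_weight = sum(weights[name] for name in pairs_by_objective)
    total = 0.0
    for name, pair in pairs_by_objective.items():
        total += weights[name] * dpo_loss(*pair, beta=beta)
    return total / total_weight


# Hypothetical objectives and log-probabilities, for illustration only.
pairs = {
    "relevance": (-1.0, -3.0, -2.0, -2.0),  # policy already prefers the chosen list
    "diversity": (-2.0, -2.0, -2.0, -2.0),  # policy is indifferent
}
loss = multi_objective_dpo_loss(pairs, {"relevance": 1.0, "diversity": 1.0})
```

An indifferent pair contributes a loss of log 2, and the loss shrinks as the policy favors the chosen list more strongly than the reference model does.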
In practice
- Apply RAG to improve long-tail QAC coverage.
- Implement DPO for multi-objective QAC alignment.
- Design hybrid architectures for low-latency QAC.
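The summary does not say how the hybrid serving architecture splits the work. One common pattern, shown here purely as an assumption, is to serve frequent prefixes from a precomputed table and call the generative model only for the long tail. The class name, the cache layout, and the `generate` callback are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


class HybridCompleter:
    """Serve head prefixes from a precomputed table, tail prefixes from a model."""

    def __init__(self, precomputed: Dict[str, List[str]],
                 generate: Callable[[str], List[str]], max_items: int = 8) -> None:
        self.precomputed = precomputed  # lists generated offline for frequent prefixes
        self.generate = generate        # online generator, stand-in for a model call
        self.max_items = max_items

    def complete(self, prefix: str) -> Tuple[List[str], str]:
        """Return the suggestion list and which path produced it."""
        key = prefix.strip().lower()
        hit = self.precomputed.get(key)
        if hit is not None:
            return hit[: self.max_items], "cache"
        return self.generate(key)[: self.max_items], "model"


def fake_model(prefix: str) -> List[str]:
    """Toy generator used only to make the sketch runnable."""
    return [f"{prefix} review", f"{prefix} price"]


completer = HybridCompleter({"ipho": ["iphone 15", "iphone case"]}, fake_model)
head = completer.complete("IPHO ")
tail = completer.complete("rare gadget")
```

Under this split, most traffic would avoid model latency entirely, while rare prefixes still benefit from generation.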
Topics
- Query Auto-Completion
- Retrieval-Augmented Generation
- Direct Preference Optimization
- Multi-objective Optimization
- End-to-End Generation
Best for: AI Engineer, NLP Engineer, AI Scientist, AI Researcher, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Apple Machine Learning Research.