Increase Recommendation Systems’ Precision with LLMs, Using Python
Summary
This article details a two-stage recommendation system designed to enhance precision using Large Language Models (LLMs) while managing computational costs. The system first employs a cheap, rule-based distance filter (Stage 1) to narrow a dataset of approximately 10,000 restaurants across eight cities down to the 50 geographically closest candidates. A powerful LLM (Stage 2) then reranks these 50 candidates against the user's natural language query and returns 5 to 10 highly precise recommendations, each with a "fit_score" and an explanation. This hybrid approach balances scalability and intelligence by applying the LLM only to a pre-filtered, relevant subset, which keeps cost and latency under control.
Key takeaway
If you are an AI Engineer or Machine Learning Engineer designing an intelligent recommendation system, adopt a two-stage funnel to balance LLM cost against precision. First use a cheap, rule-based filter to generate high-recall candidates, then apply an LLM for high-precision reranking on that much smaller candidate set. This strategy keeps your system both scalable and intelligent, because expensive LLM resources are spent only where they add value.
Key insights
Optimize LLM-powered recommendation systems by combining cheap, high-recall filtering with expensive, high-precision reranking.
Principles
- System design involves trade-offs, often described by the Accuracy-Scale-Time triangle.
- Initial candidate generation should prioritize high recall, even at the cost of low precision.
- Final selection benefits from high-precision models applied to a reduced candidate set.
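To make the recall and precision trade-off concrete, here is a minimal sketch that measures both for each stage of the funnel. The restaurant IDs and the set of "relevant" restaurants are hypothetical values chosen for illustration; they are not taken from the original article.

```python
def precision_recall(selected, relevant):
    """Return (precision, recall) of a selected set against the relevant set."""
    selected, relevant = set(selected), set(relevant)
    hits = len(selected & relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ground truth: four restaurants actually match the user's query.
relevant_ids = {3, 12, 27, 41}

# Stage 1 (assumed): the 50 nearest restaurants, IDs 0..49 for simplicity.
stage1_ids = set(range(50))

# Stage 2 (assumed): the LLM keeps five of them, four of which are relevant.
stage2_ids = {3, 8, 12, 27, 41}

print(precision_recall(stage1_ids, relevant_ids))  # (0.08, 1.0)
print(precision_recall(stage2_ids, relevant_ids))  # (0.8, 1.0)
```

In this toy setup, Stage 1 catches every relevant restaurant but only 8% of what it returns is relevant. Stage 2 keeps the same recall while raising precision to 80%.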
Method
Generate synthetic restaurant data. Implement a rule-based filter to select N_DISTANCE_CANDIDATES (e.g., 50) based on proximity. Use an LLM with structured output (e.g., Pydantic) to rerank these candidates, providing fit scores and reasons.
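A minimal sketch of the first two steps, assuming the distance filter uses great-circle (haversine) distance; the summary does not say which distance measure the original article uses. The constant name N_DISTANCE_CANDIDATES comes from the summary above. The city coordinates, field names, and helper functions are hypothetical, and only two cities are shown to keep the example short.

```python
import heapq
import math
import random

N_DISTANCE_CANDIDATES = 50  # number of candidates passed on to the LLM stage

# Hypothetical city centres (latitude, longitude), for illustration only.
CITY_CENTERS = {"Rome": (41.9028, 12.4964), "Paris": (48.8566, 2.3522)}


def make_synthetic_restaurants(n, seed=0):
    """Scatter n fake restaurants around the city centres."""
    rng = random.Random(seed)
    cities = list(CITY_CENTERS.items())
    restaurants = []
    for i in range(n):
        city, (lat, lon) = rng.choice(cities)
        restaurants.append({
            "id": i,
            "city": city,
            "lat": lat + rng.uniform(-0.1, 0.1),
            "lon": lon + rng.uniform(-0.1, 0.1),
        })
    return restaurants


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(min(1.0, math.sqrt(a)))


def nearest_candidates(restaurants, user_lat, user_lon, k=N_DISTANCE_CANDIDATES):
    """Stage 1: keep the k restaurants closest to the user, nearest first."""
    with_distance = [
        {**r, "distance_km": haversine_km(user_lat, user_lon, r["lat"], r["lon"])}
        for r in restaurants
    ]
    return heapq.nsmallest(k, with_distance, key=lambda r: r["distance_km"])


restaurants = make_synthetic_restaurants(10_000)
candidates = nearest_candidates(restaurants, 41.9028, 12.4964)
print(len(candidates), candidates[0]["city"])
```

Only the resulting candidate list, not the full dataset, would then be serialized into the LLM prompt for reranking.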
In practice
- Use simple, rule-based methods for initial data reduction to keep LLM API usage and cost low.
- Apply LLMs for nuanced interpretation and precise ranking on smaller, pre-filtered datasets.
- Enforce structured LLM responses using tools like Pydantic for reliable parsing.
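The last point can be sketched without any external dependency. The example below uses standard-library dataclasses as a stand-in for Pydantic, which would express the same constraints declaratively on a model class. The "fit_score" field name comes from the summary; the 0 to 100 score range, the other field names, the "recommendations" wrapper key, and both helper functions are assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedRestaurant:
    restaurant_id: int
    fit_score: float  # assumed to be on a 0-100 scale
    reason: str


def parse_rerank_response(raw_json, allowed_ids, max_results=10):
    """Validate the LLM's JSON reply and return the best results first.

    Raises ValueError for anything that does not match the expected shape.
    """
    try:
        items = json.loads(raw_json)["recommendations"]
        ranked = [_validate_item(item, allowed_ids) for item in items]
    except (KeyError, TypeError) as exc:
        raise ValueError(f"malformed rerank response: {exc!r}") from exc
    ids = [r.restaurant_id for r in ranked]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate restaurant_id in rerank response")
    return sorted(ranked, key=lambda r: r.fit_score, reverse=True)[:max_results]


def _validate_item(item, allowed_ids):
    """Check one recommendation; reject IDs the LLM was never shown."""
    rid, score, reason = item["restaurant_id"], item["fit_score"], item["reason"]
    if type(rid) is not int or rid not in allowed_ids:
        raise ValueError(f"unknown restaurant_id: {rid!r}")
    if type(score) not in (int, float) or not 0 <= score <= 100:
        raise ValueError(f"fit_score out of range: {score!r}")
    if not isinstance(reason, str) or not reason.strip():
        raise ValueError("reason must be a non-empty string")
    return RankedRestaurant(rid, float(score), reason.strip())


# Hypothetical LLM reply, hard-coded so the sketch runs offline.
reply = json.dumps({"recommendations": [
    {"restaurant_id": 7, "fit_score": 62, "reason": "Close by, but no terrace."},
    {"restaurant_id": 3, "fit_score": 91.5, "reason": "Quiet, vegetarian menu."},
]})
for rec in parse_rerank_response(reply, allowed_ids={3, 7, 9}):
    print(rec.restaurant_id, rec.fit_score, rec.reason)
```

Checking each returned ID against the candidates that were actually sent guards against the model inventing restaurants, which plain type validation alone would not catch.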
Topics
- Recommendation Systems
- Large Language Models
- System Design
- Hybrid AI Architectures
- Data Filtering
- Pydantic
- OpenAI API
Best for: AI Engineer, Machine Learning Engineer, AI Student
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.