ProfiLLM: Utility-Aligned Agentic User Profiling for Industrial Ride-Hailing Dispatch
Summary
ProfiLLM is an agentic Large Language Model (LLM) data pipeline for utility-aligned user profiling in industrial ride-hailing dispatch, deployed on DiDi's production dispatcher. It addresses three challenges: log data too large to fit in an LLM context window, long-tail users with few interactions, and the need for profiles that actually improve downstream prediction utility. ProfiLLM has two modules. Tool-Augmented Global Knowledge Mining uses 27 analytical tools to extract global knowledge and user clustering rules. Utility-Aligned Profile Exploration refines candidate profiles using a lightweight utility proxy and DPO (Direct Preference Optimization) fine-tuning. Reported results are up to +6.14% relative AUC improvement in outcome prediction, +4.35% GMV in dispatching simulation, and, in a 14-day online A/B test, +0.47% GMV, +0.33% Completion Rate, and -0.82% Cancel-Before-Accept rate.
Key takeaway
For MLOps engineers and AI scientists building dispatch systems who struggle to capture contextual behavioral signals or to scale LLM-based profiling, ProfiLLM demonstrates a viable path. Consider an agentic LLM pipeline with utility-aligned profile refinement: in DiDi's deployment it improved prediction AUC by up to 6.14% (relative) and raised Gross Merchandise Value (GMV) by 0.47% in a 14-day online A/B test. These results suggest the approach can bring LLM profiling to industrial scale.
Key insights
ProfiLLM leverages agentic LLMs and global knowledge mining for utility-aligned user profiling in industrial ride-hailing dispatch.
Principles
- LLMs can extract semantic features from behavioral logs.
- Utility alignment is crucial for LLM-generated profiles.
- Agentic LLMs can work around context window limits by calling analytical tools over the logs instead of reading them in full.
Method
ProfiLLM employs Tool-Augmented Global Knowledge Mining with 27 tools, then Utility-Aligned Profile Exploration to refine profiles via a lightweight proxy and DPO fine-tuning.
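The refinement step can be illustrated with a minimal sketch of how candidate profiles might be turned into DPO preference pairs. This is not the authors' implementation: the summary does not describe the proxy or the pairing rule, so `build_preference_pair`, `toy_proxy`, the `min_gap` threshold, and the best-versus-worst pairing are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PreferencePair:
    """One DPO training example: a prompt with a preferred and a dispreferred answer."""
    prompt: str
    chosen: str
    rejected: str


def build_preference_pair(
    prompt: str,
    candidates: List[str],
    utility_proxy: Callable[[str], float],
    min_gap: float = 0.0,
) -> Optional[PreferencePair]:
    """Rank candidate profiles by proxy utility and pair the best with the worst.

    Returns None when there are fewer than two candidates or when the utility
    gap is not larger than min_gap (hypothetical rule: no usable signal).
    """
    if len(candidates) < 2:
        return None
    ranked = sorted(candidates, key=utility_proxy, reverse=True)
    best, worst = ranked[0], ranked[-1]
    if utility_proxy(best) - utility_proxy(worst) <= min_gap:
        return None
    return PreferencePair(prompt=prompt, chosen=best, rejected=worst)


def toy_proxy(profile: str) -> float:
    """Toy stand-in for a utility proxy: counts dispatch-related keywords.

    A real proxy would estimate downstream prediction gain; this is only a
    placeholder so the example runs.
    """
    keywords = ("cancel", "peak", "airport")
    return float(sum(k in profile.lower() for k in keywords))


pair = build_preference_pair(
    prompt="Summarize this rider's dispatch-relevant behavior.",
    candidates=[
        "Rider uses the app sometimes.",
        "Rider often cancels during peak hours and books airport trips.",
        "Rider prefers peak-hour trips.",
    ],
    utility_proxy=toy_proxy,
)
print(pair)
```

Pairs of this shape (prompt, chosen, rejected) are the usual input format for DPO training. Pairing only the highest- and lowest-scoring candidates is one simple choice; other pairing schemes are equally plausible.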
In practice
- Mine platform-scale data with LLM agents.
- Cluster users for efficient profiling.
- Refine profiles using downstream utility proxies.
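The clustering step above can be sketched as rule-based assignment, so that profiling work can be shared per cluster and long-tail users still receive a profile. The rules, thresholds, and field names (`orders`, `cancel_rate`, `peak_share`) below are hypothetical; the summary does not state which rules ProfiLLM mines or how they are applied.

```python
from typing import Callable, Dict, List, Tuple

UserStats = Dict[str, float]
Rule = Tuple[str, Callable[[UserStats], bool]]

# Hypothetical clustering rules, checked in order; the first match wins.
CLUSTER_RULES: List[Rule] = [
    ("frequent_canceller", lambda u: u["orders"] >= 10 and u["cancel_rate"] > 0.3),
    ("regular_commuter", lambda u: u["orders"] >= 10 and u["peak_share"] > 0.6),
    ("long_tail", lambda u: u["orders"] < 10),
]


def assign_cluster(user: UserStats, default: str = "other") -> str:
    """Return the name of the first rule the user satisfies, else the default."""
    for name, matches in CLUSTER_RULES:
        if matches(user):
            return name
    return default


def group_by_cluster(users: Dict[str, UserStats]) -> Dict[str, List[str]]:
    """Group user IDs by assigned cluster so each cluster can be profiled once."""
    groups: Dict[str, List[str]] = {}
    for user_id, stats in users.items():
        groups.setdefault(assign_cluster(stats), []).append(user_id)
    return groups


users = {
    "u1": {"orders": 42, "cancel_rate": 0.45, "peak_share": 0.2},
    "u2": {"orders": 30, "cancel_rate": 0.05, "peak_share": 0.8},
    "u3": {"orders": 2, "cancel_rate": 0.0, "peak_share": 0.5},
    "u4": {"orders": 15, "cancel_rate": 0.1, "peak_share": 0.3},
}
groups = group_by_cluster(users)
print(groups)
```

Under this assumed design, a user with only two orders (`u3`) falls into a shared `long_tail` cluster and can inherit a cluster-level profile rather than needing one generated from sparse individual history.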
Topics
- Ride-Hailing Dispatch
- Large Language Models
- User Profiling
- Agentic AI
- Machine Learning Operations
- Semantic Feature Extraction
- DiDi
Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.