PerceptUI: LLM Agents as Human-Aligned Synthetic Users for UI/UX Evaluation
Summary
PerceptUI is a framework that uses Large Language Model (LLM) agents as human-aligned synthetic users for UI/UX evaluation. Developed by Nicolas Bougie, Xiaotong Ye, Gian Maria Marconi, and Narimasa Watanabe, it targets the slowness and cost of collecting human feedback by predicting how specific users would answer questions about an interface and by generating natural-language rationales for those answers. Prior multimodal LLM approaches tend to yield superficial critiques or model-biased judgments. PerceptUI is instead trained in two stages: first, contrastive reflection fine-tuning distills teacher-generated rationales from human decisions; second, a reflective prompt-evolution step learns from the model's own failure traces. According to the authors, this two-stage training lets PerceptUI reach human-level realism, generalize to unseen questions and personas, and produce accurate population-level response distributions across diverse domains and datasets.
Key takeaway
For UI/UX designers and product managers who need faster, cheaper early-stage evaluation, PerceptUI offers an alternative to traditional human studies. Its persona-conditioned LLM agents predict how specific users would respond to an interface and explain why, which shortens iteration cycles. You can simulate diverse user populations and test interface changes before committing to costly A/B tests or participant recruitment.
Key insights
PerceptUI uses two-stage LLM training to create persona-conditioned synthetic users for realistic UI/UX evaluation with natural-language rationales.
Principles
- Human-aligned evaluation requires persona conditioning.
- Distilling human decisions improves model rationales.
- Learning from failures enhances model robustness.
Method
PerceptUI trains in two stages: (i) contrastive reflection fine-tuning distills teacher-generated rationales from human decisions, and (ii) reflective prompt-evolution learns from the model's own failure traces.
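To make the second stage more concrete, here is a minimal Python sketch of what a reflective prompt-evolution loop could look like. It is an illustration under stated assumptions, not the authors' implementation: the names `Example`, `FailureTrace`, `collect_failures`, and `evolve_prompt` are hypothetical, the acceptance rule (keep a revised prompt only if it reduces failures) is an assumption, and `toy_predict` and `toy_reflect` are stand-ins for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    persona: str
    question: str
    human_answer: str


@dataclass
class FailureTrace:
    example: Example
    predicted: str


def collect_failures(predict: Callable[[str, Example], str],
                     prompt: str,
                     examples: List[Example]) -> List[FailureTrace]:
    """Return every example where the prediction differs from the human answer."""
    failures = []
    for ex in examples:
        predicted = predict(prompt, ex)
        if predicted != ex.human_answer:
            failures.append(FailureTrace(ex, predicted))
    return failures


def evolve_prompt(predict, reflect, prompt: str,
                  examples: List[Example], max_rounds: int = 3) -> str:
    """Revise the prompt from failure traces (assumed rule: keep only improvements)."""
    for _ in range(max_rounds):
        failures = collect_failures(predict, prompt, examples)
        if not failures:
            break
        candidate = prompt + "\n" + reflect(failures)
        if len(collect_failures(predict, candidate, examples)) < len(failures):
            prompt = candidate
        else:
            break
    return prompt


# Toy stand-ins for LLM calls (hypothetical, for illustration only).
def toy_predict(prompt: str, ex: Example) -> str:
    if "novice users struggle" in prompt and ex.persona == "novice":
        return "no"
    return "yes"


def toy_reflect(failures: List[FailureTrace]) -> str:
    personas = sorted({f.example.persona for f in failures})
    return "Note: " + ", ".join(personas) + " users struggle with dense layouts."


examples = [
    Example("expert", "Is the checkout flow clear?", "yes"),
    Example("novice", "Is the checkout flow clear?", "no"),
]
base_prompt = "You are a synthetic user."
evolved = evolve_prompt(toy_predict, toy_reflect, base_prompt, examples)
print(evolved)
```

In this toy run the base prompt mispredicts the novice persona, the reflection adds a note about that persona, and the revised prompt is kept because it removes the failure. A real system would replace both stand-ins with model calls and would likely score candidates on held-out data.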
In practice
- Generate persona-specific UI/UX feedback.
- Simulate population-level user responses.
- Accelerate early-stage product iteration.
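As a rough illustration of the population-level use case, the sketch below aggregates per-persona answers into a response distribution and compares it with a human distribution. The answer options, the sample data, and the choice of total variation distance as the comparison metric are all assumptions made for this example; the source does not say which metric PerceptUI is evaluated with.

```python
from collections import Counter
from typing import Dict, List


def response_distribution(responses: List[str], options: List[str]) -> Dict[str, float]:
    """Convert a list of categorical answers into a probability per option."""
    if not responses:
        raise ValueError("responses must not be empty")
    unknown = set(responses) - set(options)
    if unknown:
        raise ValueError(f"responses outside the option set: {sorted(unknown)}")
    counts = Counter(responses)
    return {opt: counts.get(opt, 0) / len(responses) for opt in options}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two distributions over the same options."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in p)


# Hypothetical data: one answer per simulated persona and per human participant.
options = ["agree", "neutral", "disagree"]
synthetic_answers = ["agree"] * 6 + ["neutral"] * 2 + ["disagree"] * 2
human_answers = ["agree"] * 5 + ["neutral"] * 3 + ["disagree"] * 2

synthetic_dist = response_distribution(synthetic_answers, options)
human_dist = response_distribution(human_answers, options)
tv = total_variation(synthetic_dist, human_dist)
print(synthetic_dist, human_dist, round(tv, 3))
```

A distance near 0 means the simulated population answers in roughly the same proportions as the human sample; a distance near 1 means the two populations almost never choose the same options.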
Topics
- UI/UX Evaluation
- LLM Agents
- Persona Simulation
- Multimodal LLMs
- Fine-tuning
- Product Development
Best for: Research Scientist, Product Manager, AI Scientist, Machine Learning Engineer, AI Product Manager
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.