SocialPersona: Benchmarking Personalized Profiling and Response with Multimodal Social-Media Context
Summary
SocialPersona is a new benchmark that evaluates multimodal large language models (MLLMs) on their ability to infer user preferences from longitudinal social-media timelines and apply them in dialogue. Unlike evaluations focused on explicit memory, SocialPersona targets personalization based on "revealed preferences" inferred from natural user traces. It comprises data from 171 everyday social-media users, including text, images, and timestamps, alongside 2,597 human-verified preference tags across seven interest domains. The benchmark supports tasks such as structured user profile construction and personalized response generation. Initial experiments show that MLLMs identify broad interests but struggle with fine-grained or recent preferences, especially when personalizing dialogue. Text and images offer complementary signals, and robust cross-modal, long-horizon user modeling remains an open challenge.
Key takeaway
For AI scientists and ML engineers developing personalized assistants, SocialPersona highlights critical gaps in current MLLM capabilities. You should prioritize robust cross-modal, long-horizon user modeling to infer fine-grained and recent interests from multimodal social data. Focus on improving how inferred profiles translate into genuinely personalized dialogue, as this remains a significant performance bottleneck.
Key insights
SocialPersona benchmarks how well MLLMs infer user preferences from multimodal social-media data and act on them in personalized dialogue.
Principles
- Comprehensive personalization requires inferred preferences.
- Multimodal data offers complementary preference signals.
- Long-horizon user modeling remains a key challenge.
Method
SocialPersona builds timelines of text, images, and timestamps for 171 social-media users, annotates them with 2,597 human-verified preference tags across seven interest domains, and supports tasks such as structured profile construction and personalized response generation.
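To make the data layout concrete, here is a minimal sketch of how one user's record might be represented. The class names, field names, sample posts, and domain labels are all hypothetical; the summary does not describe the benchmark's actual schema or name its seven domains.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Post:
    timestamp: datetime
    text: str = ""
    image_path: Optional[str] = None  # hypothetical: location of an attached image


@dataclass(frozen=True)
class PreferenceTag:
    domain: str  # one of the interest domains (placeholder names below)
    label: str   # fine-grained interest within that domain


@dataclass
class UserRecord:
    user_id: str
    posts: list[Post] = field(default_factory=list)
    tags: set[PreferenceTag] = field(default_factory=set)

    def timeline(self) -> list[Post]:
        """Return posts in chronological order."""
        return sorted(self.posts, key=lambda p: p.timestamp)

    def recent(self, n: int) -> list[Post]:
        """Return the n most recent posts, oldest first."""
        return self.timeline()[-n:] if n > 0 else []


# Illustrative record with made-up content.
user = UserRecord(
    user_id="u001",
    posts=[
        Post(datetime(2024, 5, 2), "Tried a new ramen place", "img/ramen.jpg"),
        Post(datetime(2023, 11, 9), "Trail run at dawn"),
        Post(datetime(2024, 7, 21), "", "img/concert.jpg"),
    ],
    tags={PreferenceTag("food", "ramen"), PreferenceTag("sports", "trail running")},
)
```

Keeping timestamps on every post is what would let an evaluation separate long-standing interests from recent ones, for example by comparing a model's output on `user.timeline()` with its output on `user.recent(n)`.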
In practice
- Evaluate MLLMs on fine-grained interest inference.
- Test cross-modal preference signal integration.
- Benchmark dialogue personalization with inferred profiles.
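One way to probe the gap between broad and fine-grained interest inference is to score the same prediction at two granularities. The sketch below uses set-based precision, recall, and F1 over `(domain, label)` pairs; this metric, the function names, and the sample tags are assumptions for illustration, not the benchmark's own scoring protocol.

```python
Tag = tuple[str, str]  # (domain, fine-grained label); hypothetical representation


def prf(predicted: set, gold: set) -> dict[str, float]:
    """Set-based precision, recall, and F1 between predicted and gold items."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def domains(tags: set[Tag]) -> set[str]:
    """Collapse fine-grained tags to their domains for a coarse comparison."""
    return {domain for domain, _ in tags}


# Made-up gold and predicted tags for one user.
gold: set[Tag] = {("food", "ramen"), ("sports", "trail running"), ("music", "jazz")}
predicted: set[Tag] = {
    ("food", "sushi"),
    ("sports", "trail running"),
    ("music", "jazz"),
    ("travel", "hiking"),
}

fine = prf(predicted, gold)                     # exact (domain, label) match
coarse = prf(domains(predicted), domains(gold))  # domain-only match
```

In this toy case the domain-level score is higher than the tag-level score, which is the pattern the summary reports: broad interests are recovered more reliably than fine-grained ones.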
Topics
- Multimodal LLMs
- User Profiling
- Personalized Response Generation
- Social Media Analysis
- Preference Inference
- Benchmark Datasets
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.