EPIG: Emotion-Based Prompting for Personalised Image Generation

Source: cs.AI updates on arXiv.org · Field: Technology & Digital, Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

EPIG (Emotion-Based Prompting for Personalised Image Generation) is a training-free method that improves emotional expressiveness in text-to-image diffusion models by enriching prompts before image generation. It combines psychologically informed valence-arousal emotion representations with a structured, role-aware prompt enrichment pipeline. EPIG decomposes the input prompt into subject, stimulus, and context, then maps the target emotional state to the closest descriptors in the NRC VAD lexicon by Euclidean proximity. These descriptors are injected into specific syntactic roles, steering generation towards emotionally coherent images without modifying or retraining the underlying diffusion model (for example, SDXL-Turbo). On a 10-prompt benchmark, EPIG reduces mean arousal error by 14% relative to naive insertion and by 12% relative to LLM-based expansion, with a 17% reduction for subject-centric prompts. It also preserves valence alignment and semantic content, which makes it suitable for resource-constrained and personalised scenarios.

Key takeaway

Prompt engineers and ML engineers who need precise emotional control in text-to-image generation should consider integrating EPIG's training-free, role-aware prompt enrichment. The method improves arousal alignment while preserving semantic content, particularly for subject-centric prompts. It supports emotionally consistent visuals for applications such as therapeutic visualization or personalised content, and it avoids the computational overhead and affective drift of the alternative approaches it was compared against. Note that it is most effective when the prompt has a clear emotional subject.

Key insights

Emotionally coherent image generation is achievable via prompt-level, role-aware descriptor injection using VAD lexicons.

Principles

Method

EPIG decomposes prompts, maps target emotions to NRC VAD lexicon descriptors via Euclidean distance, and inserts subject- or context-centric terms into specific syntactic roles.
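The lookup and insertion steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy lexicon values, the two-dimensional (valence, arousal) representation on an assumed 0 to 1 scale, the PromptParts structure, the function names, and the output template are all hypothetical, and the prompt is assumed to be already decomposed into subject, stimulus, and context.

```python
import math
from dataclasses import dataclass

# Toy stand-in for the NRC VAD lexicon: word -> (valence, arousal).
# Values are illustrative placeholders, NOT real lexicon entries.
TOY_LEXICON = {
    "serene": (0.85, 0.15),
    "joyful": (0.95, 0.70),
    "furious": (0.10, 0.95),
    "gloomy": (0.15, 0.25),
    "tense": (0.25, 0.80),
}


@dataclass
class PromptParts:
    """Hypothetical container for an already-decomposed prompt."""
    subject: str
    stimulus: str
    context: str


def nearest_descriptors(target, lexicon, k=2):
    """Return the k words closest to the target (valence, arousal) point,
    ranked by Euclidean distance."""
    ranked = sorted(lexicon, key=lambda word: math.dist(lexicon[word], target))
    return ranked[:k]


def enrich(parts, target, lexicon, role="subject"):
    """Insert the single nearest descriptor into the chosen syntactic role."""
    descriptor = nearest_descriptors(target, lexicon, k=1)[0]
    subject, context = parts.subject, parts.context
    if role == "subject":
        subject = f"{descriptor} {subject}"
    elif role == "context":
        context = f"{descriptor} {context}"
    else:
        raise ValueError(f"unknown role: {role!r}")
    # Assumed output template; the real pipeline may recompose differently.
    return f"{subject} {parts.stimulus}, {context} setting"


parts = PromptParts(subject="dog", stimulus="chasing a ball", context="park")
print(enrich(parts, target=(0.9, 0.75), lexicon=TOY_LEXICON, role="subject"))
print(enrich(parts, target=(0.1, 0.9), lexicon=TOY_LEXICON, role="context"))
```

The enriched string would then be passed unchanged to the diffusion model, which is what keeps the approach training-free. Choosing the role per prompt (subject versus context) is left to the caller in this sketch.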

Best for: AI Engineer, Computer Vision Engineer, Research Scientist, AI Scientist, Prompt Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.