Do Emotions Influence Moral Judgment in Large Language Models?

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Social Sciences & Behavioral Studies · Depth: Expert, quick

Summary

The study introduces an emotion-induction pipeline to investigate how emotions influence moral judgment in large language models (LLMs). The pipeline infuses a specific emotion into a moral scenario, then measures how the scenario's moral acceptability shifts across several datasets and LLMs. The results show a consistent directional pattern: positive emotions generally increase moral acceptability, while negative emotions decrease it. The shifts were large enough to reverse binary moral judgments in up to 20% of cases, with less capable models showing higher susceptibility. Some specific emotions, such as remorse, occasionally ran contrary to their expected valence and increased acceptability. A parallel human annotation study found that humans do not exhibit these systematic shifts, which points to an alignment gap in current LLMs.

Key takeaway

AI ethicists and research scientists developing or deploying LLMs in sensitive applications should be aware that current models exhibit predictable, emotion-driven shifts in moral judgment that do not match human responses. Models should therefore be tested for emotional susceptibility, and the emotional context of prompts should be treated as a potential source of bias, especially in domains that require nuanced ethical reasoning.

Key insights

LLMs exhibit systematic moral judgment shifts based on induced emotions, unlike humans, revealing an alignment gap.

Method

An emotion-induction pipeline infuses emotion into moral situations, then evaluates shifts in moral acceptability across LLMs and datasets.
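The two quantities the summary refers to, the shift in moral acceptability and the share of reversed binary judgments, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `induce_emotion` template, the `score` callable (assumed to return an acceptability score between 0 and 1 for a piece of text), and the 0.5 decision threshold are all hypothetical stand-ins, since the summary does not specify how emotions are infused or how acceptability is scored.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple


def induce_emotion(scenario: str, emotion: str) -> str:
    # Hypothetical template: the source does not say how emotion is infused,
    # so a single appended sentence stands in for that step here.
    return f"{scenario} The person involved feels {emotion}."


def evaluate_shift(
    scenarios: Sequence[str],
    emotion: str,
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> Tuple[float, float]:
    """Return (mean acceptability shift, fraction of flipped binary judgments).

    `score` is an assumed callable mapping text to an acceptability
    score in [0, 1]; in practice it would wrap a call to an LLM.
    """
    baseline = [score(s) for s in scenarios]
    induced = [score(induce_emotion(s, emotion)) for s in scenarios]

    # Positive values mean the emotion made scenarios more acceptable.
    mean_shift = mean(i - b for b, i in zip(baseline, induced))

    # A flip occurs when the binary verdict (acceptable or not) changes.
    flips = sum(
        (b >= threshold) != (i >= threshold)
        for b, i in zip(baseline, induced)
    )
    return mean_shift, flips / len(scenarios)
```

Running `evaluate_shift` once per emotion and per model would give a table of mean shifts and flip rates that can be compared against human annotations of the same scenario pairs.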

Best for: AI Scientist, Research Scientist, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.