Korean Culture into LLM Alignment: Toward Cultural Coherence

2026-06-08 · Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, extended

Summary

A new alignment-data pipeline addresses the challenge of integrating Korean culture into large language models (LLMs): rather than only suppressing undesirable outputs, it defines what a culturally coherent response looks like. The pipeline, instantiated for Korean, uses a prompt-based LLM seed generator to expand a Korean harm taxonomy. At its core is a safe-response policy adapted to Korean culture and grounded in Korean legal frameworks, social norms, and interpretive conventions. DPO fine-tuning on the 10,000 resulting triplets improved the Korean cultural safe rate across six open-weight LLMs (A.X-4.0-Light, EXAONE-3.5, Kanana-1.5, Qwen-2.5, Gemma-3, and Llama-3.1), with an average gain of +6.59 points on Korset. The improvement came with no large degradation on Korean general-capability benchmarks, and the fine-tuned models provided constructive, context-specific information, often citing Korean statutes.
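The summary reports results as a "safe rate" and an average gain in points but does not say how either is computed. The sketch below assumes the safe rate is the percentage of responses judged safe and the average gain is the mean per-model difference in that percentage. The function names and this definition are illustrative assumptions, not the paper's evaluation code.

```python
from typing import Dict, Sequence


def safe_rate(judgments: Sequence[bool]) -> float:
    """Percentage of responses judged safe (assumed definition)."""
    if not judgments:
        raise ValueError("need at least one judgment")
    return 100.0 * sum(judgments) / len(judgments)


def average_gain(before: Dict[str, float], after: Dict[str, float]) -> float:
    """Mean per-model change in safe rate, in percentage points."""
    if not before or before.keys() != after.keys():
        raise ValueError("before and after must cover the same non-empty model set")
    gains = [after[name] - before[name] for name in before]
    return sum(gains) / len(gains)
```

Under these assumptions, an average gain of +6.59 points would mean that the six per-model differences, each measured in percentage points, average to 6.59.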

Key takeaway

For AI scientists and machine learning engineers deploying LLMs in specific cultural regions, local cultural coherence is critical. Global alignment data may be insufficient on its own, leading to culturally inappropriate responses or over-refusal. Consider adopting a constructive, locale-specific alignment pipeline that integrates local legal frameworks, social norms, and interpretive conventions. This approach, demonstrated for Korean, improved cultural safety and helpfulness without large losses in general capability, which helps models be useful and trusted in target markets.

Key insights

LLM cultural alignment requires constructive, locale-specific definitions of coherence, moving beyond mere output suppression.

Principles

Cultural alignment must define what is coherent, not just what to avoid.
Queries and responses should be tightly grounded in the target culture.
Refusals must be grounded in local norms rather than being superficial.

Method

The pipeline generates DPO triplets in four stages: it expands a Korean harm taxonomy, mines hard cases with an attacker LLM, generates safe responses with a multi-model generator conditioned on a Korean-adapted policy, and filters the output with a three-judge ensemble.
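The final filtering stage of the method above can be sketched as follows. The summary does not say how the three judges' verdicts are combined, so this sketch assumes a simple majority vote. The `Triplet` fields follow the common prompt/chosen/rejected convention for preference data, and the three toy judges are hypothetical stand-ins for LLM judges, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Triplet:
    prompt: str    # culturally grounded query, e.g. from the expanded taxonomy
    chosen: str    # policy-conditioned safe response (preferred)
    rejected: str  # dispreferred response


# Hypothetical judge interface: returns True if the judge accepts the triplet.
Judge = Callable[[Triplet], bool]


def filter_triplets(
    candidates: Sequence[Triplet],
    judges: Sequence[Judge],
    min_votes: int = 2,
) -> List[Triplet]:
    """Keep triplets accepted by at least `min_votes` of exactly three judges."""
    if len(judges) != 3:
        raise ValueError("this sketch assumes exactly three judges")
    kept = []
    for triplet in candidates:
        votes = sum(1 for judge in judges if judge(triplet))
        if votes >= min_votes:  # assumed aggregation rule: majority vote
            kept.append(triplet)
    return kept


# Toy judges for illustration only; real judges would be LLM calls.
def judge_nonempty(t: Triplet) -> bool:
    return bool(t.chosen.strip())


def judge_differs(t: Triplet) -> bool:
    return t.chosen != t.rejected


def judge_more_detailed(t: Triplet) -> bool:
    return len(t.chosen) > len(t.rejected)
```

A stricter variant would set `min_votes=3` to require unanimity, trading dataset size for precision. Which rule the original pipeline uses is not stated in this summary.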

In practice

Develop harm taxonomies anchored in local legal/social contexts.
Ensure responses name applicable local statutes and norms.
Offer constructive, context-relevant alternatives alongside refusals.

Topics

LLM Alignment
Cultural Coherence
Korean NLP
DPO Fine-tuning
AI Safety
Harm Taxonomy

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.