Korean Culture into LLM Alignment: Toward Cultural Coherence

· Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, extended

Summary

A new alignment-data pipeline addresses the challenge of integrating Korean culture into large language models (LLMs). Rather than only suppressing harmful outputs, it defines what a culturally coherent response looks like. Instantiated for Korean, the pipeline uses a prompt-based LLM seed generator to expand a Korean harm taxonomy. At its core is a safe-response policy adapted to Korean culture and grounded in Korean legal frameworks, social norms, and interpretive conventions. DPO (Direct Preference Optimization) fine-tuning on the 10,000 resulting triplets improved the Korean cultural safe rate across six open-weight LLMs (A.X-4.0-Light, EXAONE-3.5, Kanana-1.5, Qwen-2.5, Gemma-3, and Llama-3.1), with an average gain of +6.59 points on Korset. The improvement came with no large degradation on Korean general-capability benchmarks, and the fine-tuned models gave constructive, context-specific information, often citing Korean statutes.
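
For readers less familiar with DPO, the following is a minimal sketch of the standard per-triplet DPO objective, not the paper's training code. The function name, the argument names, and the default `beta=0.1` are illustrative assumptions; the inputs are summed token log-probabilities of the chosen (safe) and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (prompt, chosen, rejected) triplet.

    All names and the beta default are illustrative assumptions;
    the source does not report its hyperparameters.
    """
    # How much more the policy prefers "chosen" over "rejected"
    # than the frozen reference model does.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(beta * margin)) equals softplus(-beta * margin);
    # the form below is numerically stable for large |x|.
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

The loss falls as the policy raises the likelihood of the chosen response relative to the rejected one, measured against the reference model. In practice a library trainer would handle batching and log-probability computation.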

Key takeaway

For AI scientists and machine learning engineers deploying LLMs in a specific cultural region, local cultural coherence deserves priority. Global alignment data may be insufficient on its own, leading to responses that are culturally inappropriate or that refuse too readily. Consider a constructive, locale-specific alignment pipeline that integrates local legal frameworks, social norms, and interpretive conventions. In the Korean instantiation, this approach improved cultural safety and helpfulness without large degradation of general capabilities, which makes models more useful in the target market.

Key insights

LLM cultural alignment requires constructive, locale-specific definitions of coherence, moving beyond mere output suppression.

Method

The pipeline generates DPO triplets in four stages: it expands a Korean harm taxonomy, mines hard cases with an attacker LLM, produces safe responses with a multi-model generator conditioned on a Korean-adapted policy, and filters the results with a three-judge ensemble.
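
The stages described above can be sketched as a small data-flow skeleton. This is an illustration under stated assumptions, not the authors' implementation: every name here (`Triplet`, `passes_ensemble`, `build_triplets`, `attack`, `unsafe_response`) is hypothetical, each model is stood in for by a plain callable, the source of the rejected response is assumed, and the unanimous-vote default is a guess because the summary does not say how the three judges are combined.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Triplet:
    prompt: str
    chosen: str    # safe response that passed the judge ensemble
    rejected: str  # dispreferred response (source is an assumption)


# A judge returns True when it accepts a response as safe and coherent.
Judge = Callable[[str, str], bool]


def passes_ensemble(prompt: str, response: str,
                    judges: Iterable[Judge], min_votes: int) -> bool:
    """Accept a response when at least min_votes judges approve it."""
    votes = sum(1 for judge in judges if judge(prompt, response))
    return votes >= min_votes


def build_triplets(seeds: Iterable[str],
                   attack: Callable[[str], List[str]],
                   generators: List[Callable[[str], str]],
                   unsafe_response: Callable[[str], str],
                   judges: List[Judge],
                   min_votes: int = 3) -> List[Triplet]:
    """Turn taxonomy seeds into DPO triplets (hypothetical sketch)."""
    triplets = []
    for seed in seeds:                    # seeds from the expanded taxonomy
        for prompt in attack(seed):       # attacker LLM mines hard cases
            rejected = unsafe_response(prompt)
            for generate in generators:   # multi-model safe-response generator
                candidate = generate(prompt)
                if passes_ensemble(prompt, candidate, judges, min_votes):
                    triplets.append(Triplet(prompt, candidate, rejected))
                    break                 # keep one accepted response per prompt
    return triplets
```

In this sketch each generator callable is assumed to already apply the Korean-adapted safe-response policy, for example through its system prompt. Prompts for which no candidate passes the ensemble are dropped rather than kept with a weak chosen response.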

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.