More Context, Larger Models, or Moral Knowledge? A Systematic Study of Schwartz Value Detection in Political Texts

· Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

This study systematically compares three levers for Schwartz value detection in political texts: input context, retrieved moral knowledge, and model scale. Full-document context improved supervised DeBERTa-v3 encoders by 3.8–4.8 macro-F1 points over sentence-only input, but did not consistently help zero-shot LLMs ranging from 12B to 123B parameters. Retrieved moral knowledge, drawn from a curated 58-chunk knowledge base and added via early fusion, improved macro-F1 by 0.014 to 0.036 (1.4 to 3.6 points) across all tested model families and context conditions. Scaling, whether from DeBERTa-v3-base to DeBERTa-v3-large or from 12B to larger LLMs, did not guarantee gains. The strongest system was DeBERTa-v3-base with full-document context and early-fusion RAG, at 0.314 macro-F1, which suggests that task supervision matters more than parameter count under this protocol.

Key takeaway

Machine learning engineers building value-sensitive NLP systems should prioritize supervised encoders such as DeBERTa-v3-base and choose context length deliberately (for encoders, full-document input worked best). Add simple early-fusion retrieval augmentation (RAG) over a curated moral knowledge base, especially where label boundaries are ambiguous. This approach offers better performance and interpretability than larger zero-shot LLMs, and it is also more cost-effective and easier to reproduce. Always evaluate performance per value, not just aggregate macro-F1.
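
To illustrate why per-value evaluation matters, here is a minimal sketch of per-label F1 next to macro-F1. It assumes a multi-label setup in which each sentence carries zero or more values; the summary does not specify the label format, and the function names and toy data below are hypothetical, not taken from the paper.

```python
from typing import Dict, List, Set


def per_label_f1(
    gold: List[Set[str]], pred: List[Set[str]], labels: List[str]
) -> Dict[str, float]:
    """Compute F1 separately for each label over multi-label annotations."""
    scores: Dict[str, float] = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # Convention chosen here: a label with no gold and no predicted
        # instances scores 0.0 rather than raising an error.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of per-label F1 scores."""
    return sum(scores.values()) / len(scores)


# Toy data, invented for illustration only.
LABELS = ["security", "tradition", "universalism"]
GOLD = [{"security"}, {"security", "tradition"}, {"universalism"}, set()]
PRED = [{"security"}, {"security"}, {"tradition"}, set()]

if __name__ == "__main__":
    scores = per_label_f1(GOLD, PRED, LABELS)
    for label, value in scores.items():
        print(f"{label}: {value:.3f}")
    print(f"macro-F1: {macro_f1(scores):.3f}")
```

In this toy case one value is detected perfectly while the other two are missed entirely, a split that the single macro-F1 number hides.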

Key insights

Value detection performance depends critically on context, external knowledge, and model architecture, not just scale.

Method

The study varied four factors: input context (sentence, window, or full document), retrieval (no-RAG vs. retrieval from a 58-chunk moral knowledge base), fusion strategy (early, late, or cross-attention), and model family (supervised DeBERTa-v3-base/large encoders vs. zero-shot LLMs with 12B–123B parameters).
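
The context conditions and early fusion can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: it assumes "window" means the target sentence plus its neighbors, that "early fusion" means concatenating retrieved knowledge with the input text before it reaches the model, and it substitutes a toy token-overlap retriever for whatever retriever the authors used. All function names and the bracketed marker strings are hypothetical.

```python
from typing import List


def build_context(sentences: List[str], idx: int, mode: str, window: int = 1) -> str:
    """Return the input context for the target sentence at position idx."""
    if mode == "sentence":
        return sentences[idx]
    if mode == "window":
        # Assumption: a symmetric window of neighboring sentences.
        lo = max(0, idx - window)
        hi = min(len(sentences), idx + window + 1)
        return " ".join(sentences[lo:hi])
    if mode == "document":
        return " ".join(sentences)
    raise ValueError(f"unknown context mode: {mode}")


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank knowledge chunks by token overlap with the query."""
    query_tokens = set(query.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_tokens & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def early_fusion(target: str, context: str, retrieved: List[str]) -> str:
    """Concatenate retrieved knowledge, context, and target into one input string."""
    knowledge = " ".join(retrieved)
    # The marker strings are illustrative placeholders, not special tokens.
    return f"[KNOWLEDGE] {knowledge} [CONTEXT] {context} [TARGET] {target}"


if __name__ == "__main__":
    # Invented example text and knowledge chunks.
    doc = [
        "Our borders face new threats.",
        "We must protect safety and stability.",
        "This is our duty.",
    ]
    kb = [
        "security means safety and stability",
        "tradition means respect for customs",
        "power means status and dominance",
    ]
    target = doc[1]
    context = build_context(doc, 1, "document")
    print(early_fusion(target, context, retrieve(target, kb, k=1)))
```

The fused string would then be tokenized and passed to an encoder or placed in an LLM prompt. Late and cross-attention fusion combine the retrieved knowledge at a later stage and are not shown here.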

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.