When Does Delegation Beat Majority? A Delegation-Based Aggregator for Multi-Sample LLM Inference

2026-06-06 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Expert, quick

Summary

Propagational Proxy Voting (PPV) is an unsupervised aggregator for multi-sample LLM inference that outperforms majority voting. On MMLU-Pro it improves accuracy by +1.5 percentage points (pp) overall and by +2.24 pp on the non-trivial subset, with a paired McNemar p ≈ 1.0e-14 (n = 8,099). PPV uses two signals that majority voting discards: within-group letter entropy and between-group reasoning geometry. These drive two levers: "WHEN" (self-weight), set by letter entropy, and "WHOM" (peer delegation), set by per-question-centered embedding cosine. The method needs no gold labels or auxiliary training. For each question, it partitions 128 sampled generations into 16 groups, computes each group's semantic entropy and reasoning-embedding centroid, and feeds both into a stochastic delegation matrix that determines the consensus answer. In one example, PPV overturns a 10-6 majority because the majority cluster is geometrically incoherent (mean within-cluster cosine -0.02) while the minority cluster is tight (+0.26).
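
The within-group letter entropy mentioned above can be illustrated with a minimal sketch. It assumes equal groups of 8 samples (128 generations over 16 groups) and Shannon entropy in nats over the answer letters; the summary does not state the log base or any normalization, and the function and variable names here are hypothetical.

```python
import math
from collections import Counter

def letter_entropy(letters):
    """Shannon entropy (nats) of the answer-letter distribution in one group."""
    counts = Counter(letters)
    n = len(letters)
    # p * log(1/p) summed over the letters that actually occur
    return sum((c / n) * math.log(n / c) for c in counts.values())

# Two hypothetical groups of 8 sampled answers each (illustrative data only).
confident_group = ["B"] * 8
split_group = ["B", "B", "B", "C", "C", "A", "D", "B"]

print(letter_entropy(confident_group))
print(letter_entropy(split_group))
```

Both groups would cast a vote for "B" under plain majority voting, yet the second group is far less certain. That difference is the kind of signal the "WHEN" lever is described as using.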

Key takeaway

For machine learning engineers optimizing multi-sample LLM inference: consider Propagational Proxy Voting (PPV) in place of simple majority voting. It improved accuracy by +1.5 pp on MMLU-Pro by using two signals that majority voting ignores, within-group letter entropy and between-group reasoning geometry. The method is unsupervised and requires no gold labels or auxiliary training, so it adds no training step to an existing sampling pipeline. Evaluate PPV where the reliability of LLM outputs matters, particularly in reasoning-heavy applications.
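
One way to run such an evaluation on your own data is a paired McNemar test, the same kind of test the summary reports. The sketch below is an exact two-sided version built only on the standard library; the per-question correctness lists are hypothetical toy data, not results from the article.

```python
from math import comb

def discordant_counts(correct_a, correct_b):
    """Count questions that only method A, or only method B, answered correctly."""
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    return only_a, only_b

def mcnemar_exact(only_a, only_b):
    """Two-sided exact McNemar p-value (binomial test on the discordant pairs)."""
    n = only_a + only_b
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical per-question correctness for two aggregators on 15 questions.
majority_ok = [True] * 5 + [False] * 9 + [True] * 1
ppv_ok = [True] * 5 + [True] * 9 + [False] * 1

only_majority, only_ppv = discordant_counts(majority_ok, ppv_ok)
print(only_majority, only_ppv, mcnemar_exact(only_majority, only_ppv))
```

Only the questions on which the two aggregators disagree enter the test, which is why a paired test can be informative even when the overall accuracy gap is small.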

Key insights

PPV beats majority voting in LLM inference by using delegation based on entropy and reasoning geometry.

Principles

Majority voting discards valuable LLM signals.
Delegation can improve LLM consensus accuracy.
Geometric coherence indicates answer reliability.
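
The geometric-coherence principle can be sketched as a mean within-cluster cosine over per-question-centered embeddings. This is an illustration under stated assumptions, not the article's implementation: the centering step (subtracting the mean of all group centroids for the question), the function names, and the synthetic 32-dimensional data are all hypothetical.

```python
import numpy as np

def centered_unit(embeddings):
    """Subtract the per-question mean, then scale each row to unit length."""
    x = np.asarray(embeddings, dtype=float)
    x = x - x.mean(axis=0, keepdims=True)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

def mean_within_cluster_cosine(unit, members):
    """Average pairwise cosine among the rows listed in `members` (self-pairs excluded)."""
    k = len(members)
    if k < 2:
        return float("nan")
    sims = unit[members] @ unit[members].T
    return float((sims.sum() - np.trace(sims)) / (k * (k - 1)))

# Synthetic centroids: 10 scattered "majority" groups, 6 tightly aligned "minority" groups.
rng = np.random.default_rng(0)
direction = np.zeros(32)
direction[0] = 3.0
majority = rng.standard_normal((10, 32))
minority = direction + 0.3 * rng.standard_normal((6, 32))

unit = centered_unit(np.vstack([majority, minority]))
loose = mean_within_cluster_cosine(unit, list(range(10)))
tight = mean_within_cluster_cosine(unit, list(range(10, 16)))
print(f"majority cohesion {loose:+.2f}, minority cohesion {tight:+.2f}")
```

A larger cluster with cohesion near zero agrees on a letter without agreeing on the reasoning, which is the situation described in the 10-6 example.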

Method

For each question, partition 128 LLM generations into 16 groups, compute each group's letter entropy and reasoning-embedding centroid, then feed both into a stochastic delegation matrix to obtain the consensus answer.
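
A rough sketch of how such a pipeline could fit together is shown below. The summary does not give the formulas, so every specific here is an assumption made for illustration: the self-weight `1 - H / log(n_options)`, the softmax over centered cosines with temperature `tau`, the `floor` on self-weight, and the way delegated weight is propagated until it is cast. The name `ppv_sketch` is hypothetical, and at least two groups are required.

```python
import numpy as np

def ppv_sketch(picks, entropies, centroids, n_options=10, tau=0.1, floor=0.05):
    """Illustrative delegation-based tally; not the article's exact method."""
    n = len(picks)
    # WHEN (assumed form): low letter entropy -> keep more of your own vote.
    h = np.asarray(entropies, dtype=float) / np.log(n_options)
    self_w = np.clip(1.0 - h, floor, 1.0)
    # WHOM (assumed form): softmax over per-question-centered cosine to peers.
    x = np.asarray(centroids, dtype=float)
    x = x - x.mean(axis=0, keepdims=True)
    x = x / np.clip(np.linalg.norm(x, axis=1, keepdims=True), 1e-12, None)
    logits = (x @ x.T) / tau
    np.fill_diagonal(logits, -np.inf)  # a group cannot delegate to itself
    logits -= logits.max(axis=1, keepdims=True)
    peer = np.exp(logits)
    peer /= peer.sum(axis=1, keepdims=True)
    # Row i of `deleg` is the share of group i's weight passed to each peer;
    # together with self_w on the diagonal it forms a row-stochastic matrix.
    deleg = (1.0 - self_w)[:, None] * peer
    # Weight arriving at each group: its own unit plus everything delegated to it.
    arrived = np.linalg.solve(np.eye(n) - deleg.T, np.ones(n))
    cast = self_w * arrived  # weight each group finally casts for its own pick
    scores = {}
    for letter, w in zip(picks, cast):
        scores[letter] = scores.get(letter, 0.0) + float(w)
    return max(scores, key=scores.get), scores

# Toy case with 3 hypothetical groups: two uncertain "A" groups, one confident "B" group.
toy_picks = ["A", "A", "B"]
toy_entropies = [np.log(10), np.log(10), 0.0]
toy_centroids = [[1.0, 0.0], [-1.0, 0.0], [0.0, 0.5]]
answer, scores = ppv_sketch(toy_picks, toy_entropies, toy_centroids)
print(answer, scores)
```

In this toy case the two uncertain groups sit closer to the confident group than to each other, so most of their weight flows to it and the 2-1 majority is overturned. When every group has zero entropy, nothing is delegated and the sketch reduces to a plain group-level majority vote.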

In practice

Apply PPV for unsupervised LLM aggregation.
Use letter entropy to weight each group's own pick.
Employ embedding cosine for peer delegation.

Topics

LLM Inference
Multi-Sample Aggregation
Propagational Proxy Voting
Majority Voting
MMLU-Pro Benchmark
Reasoning Geometry
Letter Entropy

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.