Rhetorical Questions in LLM Representations: A Linear Probing Study

2026-04-16 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

A study by Yao, Anand, Zhuang, and Jiang investigates how large language models (LLMs) internally represent rhetorical questions, which are used to persuade or signal stance rather than to seek information. Using linear probes on two social media datasets, RQ (Twitter) and SRAQ (Reddit), the researchers found that rhetorical signals emerge in early layers and are most stably captured by last-token representations in decoder-only models such as Qwen3-32B and Llama-3.3-70B. Rhetorical questions are linearly separable from information-seeking questions within each dataset, and probes reach AUROC scores of 0.7 to 0.8 under cross-dataset transfer. Despite this similar discriminative performance, probes trained on different datasets rank rhetorical instances very differently, with Jaccard overlap among top-ranked examples often below 0.2. Qualitative analysis showed that these divergences correspond to distinct rhetorical phenomena: some probes capture discourse-level rhetorical stance in extended arguments, while others emphasize localized, syntax-driven interrogative acts. This suggests that rhetorical questions are encoded along multiple linear directions rather than one.
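The rank-agreement measure mentioned above can be illustrated with a short sketch. The function name `top_k_jaccard`, the choice of `k`, and the toy scores are assumptions made for illustration; the summary does not say how the authors computed their overlap beyond naming Jaccard overlap of top-ranked examples.

```python
import numpy as np

def top_k_jaccard(scores_a, scores_b, k):
    """Jaccard overlap between the top-k examples under two score vectors."""
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    if scores_a.shape != scores_b.shape:
        raise ValueError("score vectors must cover the same examples")
    if not 0 < k <= scores_a.size:
        raise ValueError("k must be between 1 and the number of examples")
    # Indices of the k highest-scoring examples under each probe.
    top_a = set(np.argsort(-scores_a, kind="stable")[:k].tolist())
    top_b = set(np.argsort(-scores_b, kind="stable")[:k].tolist())
    return len(top_a & top_b) / len(top_a | top_b)

# Toy scores for six questions from two hypothetical probes (not real data).
probe_a = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
probe_b = [0.2, 0.9, 0.1, 0.8, 0.7, 0.3]
overlap = top_k_jaccard(probe_a, probe_b, k=3)
print(f"top-3 Jaccard overlap: {overlap:.2f}")
```

In this toy case the two probes share only one of their top three examples, so the overlap is low even though each probe could still separate the classes reasonably well on its own.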

Key takeaway

For NLP engineers and AI scientists developing or evaluating LLMs: rhetorical questions are not encoded along a single, unified linear direction. A probe may capture a different facet of rhetorical intent depending on the dataset it was trained on and the probing method used. When assessing rhetorical understanding, go beyond AUROC and add rank agreement and qualitative analysis, so that both discourse-level stance and localized interrogative acts are covered.

Key insights

LLMs represent rhetorical questions heterogeneously, with multiple linear directions capturing distinct rhetorical cues.

Principles

Last-token representations offer stable rhetorical signals.
Linear separability does not imply shared representation.
Rhetorical meaning is context-sensitive and heterogeneous.

Method

Linear probes (diffMean, logistic regression, and hinge-loss classifiers) were applied to PCA-reduced last-token representations from LLMs on the two social media datasets to analyze how rhetorical questions are encoded.
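As a rough illustration of this pipeline, the sketch below fits PCA on training representations, builds a diffMean probe (the difference between class means), and scores held-out examples by AUROC. It covers only the diffMean variant. The synthetic data, the 8-component PCA, the train/test split, and all function names are assumptions for illustration, not the authors' settings or code.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    return (X - mean) @ components.T

def diff_mean_direction(Z, y):
    """diffMean probe: unit vector from the class-0 mean to the class-1 mean."""
    direction = Z[y == 1].mean(axis=0) - Z[y == 0].mean(axis=0)
    return direction / np.linalg.norm(direction)

def auroc(scores, y):
    """Probability that a positive outscores a negative (ties count half)."""
    pos = scores[y == 1][:, None]
    neg = scores[y == 0][None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

# Synthetic stand-in for last-token representations (not real model outputs).
rng = np.random.default_rng(0)
n, d = 400, 64
y = np.arange(n) % 2                      # 1 = rhetorical, 0 = information-seeking
signal = np.zeros(d)
signal[0] = 4.0                           # assumed class offset along one axis
X = rng.normal(size=(n, d)) + np.outer(y, signal)

X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

mean, components = pca_fit(X_train, n_components=8)
direction = diff_mean_direction(pca_transform(X_train, mean, components), y_train)
test_scores = pca_transform(X_test, mean, components) @ direction
test_auroc = auroc(test_scores, y_test)
print(f"held-out AUROC: {test_auroc:.3f}")
```

Cross-dataset transfer would follow the same shape: fit PCA and the probe direction on one dataset, then score the other dataset's representations with the frozen mean, components, and direction.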

In practice

Prioritize last-token embeddings for rhetorical intent analysis.
Use multiple probes to capture diverse rhetorical aspects.
Consider context granularity when analyzing rhetorical questions.
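For the first recommendation, one practical detail is picking the last real token of each sequence when a batch is padded. The sketch below assumes hidden states have already been extracted as an array of shape (batch, sequence, dimension) together with a 0/1 attention mask; the function name and the toy arrays are hypothetical, and extraction from an actual model is not shown.

```python
import numpy as np

def last_token_representations(hidden_states, attention_mask):
    """Select the hidden state of the last non-padding token per sequence.

    hidden_states: array of shape (batch, seq_len, dim)
    attention_mask: array of shape (batch, seq_len), 1 for real tokens
    """
    hidden_states = np.asarray(hidden_states)
    mask = np.asarray(attention_mask).astype(bool)
    if not mask.any(axis=1).all():
        raise ValueError("every sequence needs at least one real token")
    seq_len = mask.shape[1]
    # argmax on the reversed mask finds the last real token,
    # which works for both left-padded and right-padded batches.
    last_idx = seq_len - 1 - np.argmax(mask[:, ::-1], axis=1)
    return hidden_states[np.arange(hidden_states.shape[0]), last_idx]

# Toy batch: one right-padded and one left-padded sequence.
hidden = np.arange(2 * 4 * 3).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 0],
                 [0, 1, 1, 1]])
reps = last_token_representations(hidden, mask)
print(reps.shape)
```

Taking index -1 blindly would return a padding position for the right-padded sequence, which is why the mask is consulted here.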

Topics

Rhetorical Questions
LLM Representations
Linear Probing
Cross-Dataset Transfer
Last-Token Embeddings

Code references

ruyi101/rq-representation-probing

Best for: AI Scientist, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.