DiscoTrace: Representing and Comparing Answering Strategies of Humans and LLMs in Information-Seeking Question Answering

Source: Computation and Language · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, quick

Summary

DiscoTrace is a method for identifying and representing the rhetorical strategies that humans and Large Language Models (LLMs) use when answering information-seeking questions. It models an answer as a sequence of question-related discourse acts, paired with interpretations of the original question, and builds this representation on Rhetorical Structure Theory (RST) parses. Applying DiscoTrace to answers from nine human communities revealed substantial diversity in their preferred answer-construction strategies. LLMs, by contrast, showed little rhetorical diversity, even when explicitly prompted to follow a specific community's guidelines. LLMs also consistently favored breadth, addressing question interpretations that human answerers typically chose to omit.

Key takeaway

AI engineers developing LLM-based question answering systems should recognize that current LLMs lack the rhetorical diversity of human communities and tend to over-address question interpretations. Development efforts should focus on context-aware strategies that let LLMs produce more pragmatically appropriate and diverse answers, rather than on simply mimicking human guidelines.

Key insights

DiscoTrace reveals LLMs lack rhetorical diversity and over-address question interpretations compared to human answerers.
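
To make "rhetorical diversity" and "breadth" concrete, here is a minimal sketch of how the two properties could be quantified. Everything in it is an assumption for illustration: the function names `strategy_entropy` and `breadth`, the use of Shannon entropy, the act labels, and the toy data are not taken from the paper, which may measure these properties differently.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


def strategy_entropy(act_sequences: Iterable[Sequence[str]]) -> float:
    """Shannon entropy (in bits) over distinct discourse-act sequences.

    Illustrative proxy for rhetorical diversity, not the paper's metric:
    higher values mean answers are spread over more distinct strategies.
    """
    counts = Counter(tuple(seq) for seq in act_sequences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def breadth(addressed: int, available: int) -> float:
    """Fraction of a question's interpretations that an answer addresses."""
    return addressed / available if available else 0.0


# Made-up toy data with hypothetical act labels (not results from the paper).
human_like = [
    ("answer",),
    ("anecdote", "answer"),
    ("answer", "caveat"),
    ("clarifying_question",),
]
llm_like = [("answer", "elaboration", "caveat")] * 4

human_entropy = strategy_entropy(human_like)  # four distinct strategies
llm_entropy = strategy_entropy(llm_like)      # one repeated strategy
```

On this toy data the repeated strategy yields zero entropy while four distinct strategies yield two bits, which mirrors the qualitative contrast described above without reproducing any reported number.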

Principles

Method

DiscoTrace represents answers as sequences of question-related discourse acts and question interpretations, annotated on rhetorical structure theory parses.
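
The representation described above can be sketched as a small data structure. This is a hedged illustration only: the class names `DiscourseAct` and `AnswerTrace`, the field names, the act labels, and the example question are all hypothetical, and the underlying RST parse is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DiscourseAct:
    # Hypothetical act label; the paper's actual act inventory is not given here.
    label: str
    # Span of answer text that realizes this act.
    text: str
    # Index of the question interpretation this act addresses, if any.
    interpretation: Optional[int] = None


@dataclass
class AnswerTrace:
    question: str
    # Candidate readings of the question (assumed to be supplied by an annotator).
    interpretations: List[str]
    # Discourse acts in the order they appear in the answer.
    acts: List[DiscourseAct] = field(default_factory=list)

    def act_sequence(self) -> List[str]:
        """Return the answer as an ordered sequence of act labels."""
        return [act.label for act in self.acts]

    def addressed_interpretations(self) -> List[int]:
        """Return indices of interpretations the answer addresses, in first-use order."""
        seen: List[int] = []
        for act in self.acts:
            if act.interpretation is not None and act.interpretation not in seen:
                seen.append(act.interpretation)
        return seen


# Made-up example: the answer addresses only the second interpretation.
trace = AnswerTrace(
    question="How do I keep basil alive indoors?",
    interpretations=["watering schedule", "light requirements"],
    acts=[
        DiscourseAct("direct_answer", "Give it six hours of light.", interpretation=1),
        DiscourseAct("elaboration", "A south-facing window works.", interpretation=1),
        DiscourseAct("caveat", "Results vary by climate."),
    ],
)
```

Keeping the act sequence and the addressed interpretations as separate views reflects the two comparisons the summary highlights: which strategies an answerer uses, and how many readings of the question the answer covers.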

Best for: AI Engineer, Machine Learning Engineer, Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.