MISID: A Multimodal Multi-turn Dataset for Complex Intent Recognition in Strategic Deception Games

2026-04-14 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Gaming & Interactive Media · Depth: Expert, quick

Summary

Researchers introduce MISID, a multimodal, multi-turn, multi-participant benchmark dataset for complex intent recognition in strategic deception games. It addresses limitations of existing benchmarks by focusing on long-context interactions in which participants sustain deceptive narratives over extended periods. MISID uses a two-tier, multi-dimensional annotation scheme that supports discourse analysis and evidence-based causal tracking. Initial evaluations of state-of-the-art Multimodal Large Language Models (MLLMs) on MISID revealed significant deficiencies: text-prior visual hallucination, impaired cross-modal synergy, and limited causal chaining. To mitigate these issues, the authors propose FRACTAM, a baseline framework built on a "Decouple-Anchor-Reason" paradigm, which improves hidden intent detection and inference while maintaining perceptual accuracy.

Key takeaway

For research scientists developing or evaluating Multimodal Large Language Models (MLLMs), MISID provides a benchmark for assessing performance in complex, multi-turn strategic deception scenarios. Consider adopting FRACTAM's "Decouple-Anchor-Reason" paradigm to address the observed MLLM deficiencies in cross-modal synergy and causal reasoning; it may improve your models' hidden intent detection and inference.

Key insights

The MISID dataset and the FRACTAM framework advance complex, multi-turn, multimodal intent recognition in strategic deception games.

Principles

Complex intent recognition requires long-context analysis.
MLLMs struggle with cross-modal synergy and causal chaining.
Decoupling unimodal facts reduces text bias.

Method

FRACTAM uses a "Decouple-Anchor-Reason" paradigm to extract unimodal factual representations, perform two-stage retrieval for long-range anchoring, and construct explicit cross-modal evidence chains.
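
The three stages named above can be illustrated with a toy sketch. This is not the authors' implementation: the names `Fact`, `coarse_retrieve`, `rerank`, and `build_chain` are hypothetical, the facts are written by hand rather than extracted by a model, and simple token overlap stands in for whatever retrieval FRACTAM actually uses. It only shows the shape of the idea: keep per-modality facts separate, narrow a long history in two retrieval passes, then order the surviving evidence into a chain that spans both modalities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One unimodal observation (hypothetical structure, not from the paper)."""
    turn: int
    speaker: str
    modality: str  # "text" or "visual"
    content: str


def tokens(s: str) -> set[str]:
    return set(s.lower().split())


def coarse_retrieve(query: str, facts: list[Fact]) -> list[Fact]:
    """Stage 1 (assumed): cheap filter keeping facts that share any token with the query."""
    q = tokens(query)
    return [f for f in facts if q & tokens(f.content)]


def rerank(query: str, candidates: list[Fact], top_k: int = 3) -> list[Fact]:
    """Stage 2 (assumed): rerank by Jaccard similarity, earlier turns first on ties."""
    q = tokens(query)

    def score(f: Fact) -> float:
        t = tokens(f.content)
        return len(q & t) / len(q | t)

    return sorted(candidates, key=lambda f: (-score(f), f.turn))[:top_k]


def build_chain(anchors: list[Fact]) -> tuple[list[Fact], bool]:
    """Reason: order anchors by turn and report whether both modalities are present."""
    chain = sorted(anchors, key=lambda f: f.turn)
    cross_modal = {f.modality for f in chain} == {"text", "visual"}
    return chain, cross_modal


# Decouple: text and visual facts are stored as separate records (hand-written here).
facts = [
    Fact(1, "P3", "text", "P3 claims to be innocent"),
    Fact(1, "P3", "visual", "P3 avoids eye contact"),
    Fact(4, "P5", "text", "P5 accuses P2"),
    Fact(7, "P3", "text", "P3 claims to have checked P2"),
    Fact(7, "P3", "visual", "P3 smiles while speaking"),
]

query = "P3 claims"
candidates = coarse_retrieve(query, facts)   # Anchor, stage 1
anchors = rerank(query, candidates)          # Anchor, stage 2
chain, cross_modal = build_chain(anchors)    # Reason

for f in chain:
    print(f"turn {f.turn} [{f.modality}] {f.content}")
print("cross-modal chain:", cross_modal)
```

Keeping the modalities as separate records until the final step means the visual observations in this sketch cannot be overwritten by what the transcript says, which is the intuition behind decoupling as a guard against text bias.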

In practice

Use MISID to benchmark MLLMs on multi-turn intent recognition in deception games.
Apply FRACTAM as a baseline for improving hidden intent detection and inference.
Build explicit cross-modal evidence chains rather than relying on text alone.

Topics

MISID Dataset
Complex Intent Recognition
Strategic Deception Games
Multimodal LLMs
FRACTAM Framework

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.