Multi-View Attention Multiple-Instance Learning Enhanced by LLM Reasoning for Cognitive Distortion Detection

Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, AI for Mental Health Applications · Depth: Expert, quick

Summary

The paper proposes a framework for automatically detecting cognitive distortions, which are often linked to mental health disorders. The framework combines Large Language Models (LLMs) with a Multiple-Instance Learning (MIL) architecture to improve interpretability and expression-level reasoning. Each utterance is first decomposed into Emotion, Logic, and Behavior (ELB) components. An LLM then processes these components to infer multiple distortion instances, each with a predicted type, an expression, and a model-assigned salience score. A Multi-View Gated Attention mechanism integrates the instances for final classification. Experiments on a Korean dataset (KoACD) and an English dataset (Therapist QA) indicate that the ELB components and LLM-inferred salience scores significantly improve classification performance, particularly for distortions with high interpretive ambiguity. The dataset and implementation details are publicly available.
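To make the instance structure concrete, here is a minimal sketch of how one utterance's ELB decomposition and LLM-inferred instances might be represented. The class names, field names, JSON reply format, and example labels are all hypothetical and chosen for illustration only; the summary does not specify the paper's actual schema, label set, or prompt format.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class ELB:
    """Hypothetical container for the three components of one utterance."""
    emotion: str
    logic: str
    behavior: str


@dataclass
class DistortionInstance:
    """One LLM-inferred instance: predicted type, expression, salience."""
    distortion_type: str
    expression: str
    salience: float  # assumption: a score in [0, 1]


def parse_instances(llm_json: str) -> List[DistortionInstance]:
    """Parse an assumed JSON reply into instances, clamping salience to [0, 1]."""
    instances = []
    for item in json.loads(llm_json):
        salience = min(1.0, max(0.0, float(item["salience"])))
        instances.append(
            DistortionInstance(item["type"], item["expression"], salience)
        )
    return instances


# Made-up reply for illustration; not taken from KoACD or Therapist QA.
reply = (
    '[{"type": "catastrophizing", "expression": "everything is ruined", "salience": 0.9},'
    ' {"type": "labeling", "expression": "I am a failure", "salience": 1.4}]'
)
bag = parse_instances(reply)  # the "bag" of instances for one utterance
```

In MIL terms, the list returned by `parse_instances` plays the role of a bag: the utterance carries one label, while the instances inside it are the candidate evidence.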

Key takeaway

For research scientists building NLP systems for mental health, this framework offers a way to handle the inherent ambiguity of cognitive distortion detection. Integrating LLMs with a MIL architecture and ELB components can yield more accurate and interpretable results, especially for nuanced distortions. Consider adopting this psychologically grounded method for fine-grained reasoning in mental health NLP applications; the publicly available dataset and implementation can speed up prototyping.

Key insights

Combining LLMs with MIL and ELB components improves cognitive distortion detection by enhancing interpretability and reasoning.

Principles

Method

The method has three steps: (1) decompose each utterance into Emotion, Logic, and Behavior (ELB) components; (2) use an LLM to infer distortion instances from those components, each with a type, an expression, and a salience score; and (3) integrate the instances via Multi-View Gated Attention for classification.
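The integration step can be illustrated with a minimal sketch of gated attention pooling, the standard form used in attention-based MIL. This is not the paper's implementation: the summary does not define the "views" or say how salience enters the attention, so the sketch shows a single view and assumes salience is added to the attention logits as a log-prior. The function names, parameters (`V`, `U`, `w`), and toy values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def matvec(M: Sequence[Sequence[float]], x: Sequence[float]) -> List[float]:
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def gated_attention_pool(
    H: Sequence[Sequence[float]],   # K instance embeddings, each of length d
    salience: Sequence[float],      # K LLM-assigned scores, assumed in (0, 1]
    V: Sequence[Sequence[float]],   # (m x d) projection for the tanh branch
    U: Sequence[Sequence[float]],   # (m x d) projection for the sigmoid gate
    w: Sequence[float],             # length-m scoring vector
) -> Tuple[List[float], List[float]]:
    """Return (bag embedding, attention weights) for one bag of instances."""
    logits = []
    for h, s in zip(H, salience):
        t = [math.tanh(a) for a in matvec(V, h)]
        g = [sigmoid(a) for a in matvec(U, h)]
        score = sum(wi * ti * gi for wi, ti, gi in zip(w, t, g))
        # Assumption: salience biases attention as a log-prior on the logit.
        logits.append(score + math.log(max(s, 1e-6)))
    # Numerically stable softmax over the K instances.
    mx = max(logits)
    exps = [math.exp(l - mx) for l in logits]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Bag embedding: attention-weighted sum of instance embeddings.
    d = len(H[0])
    z = [sum(a * h[j] for a, h in zip(alpha, H)) for j in range(d)]
    return z, alpha


# Toy bag: two instances with symmetric embeddings and unequal salience.
identity = [[1.0, 0.0], [0.0, 1.0]]
z, alpha = gated_attention_pool(
    H=[[1.0, 0.0], [0.0, 1.0]],
    salience=[0.9, 0.1],
    V=identity,
    U=identity,
    w=[1.0, 1.0],
)
```

In this toy bag the two instances receive identical gated scores, so the attention weights reduce to the normalized salience (roughly 0.9 and 0.1). The weights `alpha` are what make the prediction inspectable: they indicate which inferred instance contributed most to the bag-level representation that a classifier would consume.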

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.