fMRI-LM: Towards a Universal Foundation Model for Language-Aligned fMRI Understanding

Source: cs.AI updates on arXiv.org · Field: Science & Research (Health & Medical Research, Life Sciences & Biology, Mathematics & Computational Sciences) · Depth: Expert, extended

Summary

fMRI-LM is a foundation model designed to bridge functional MRI (fMRI) data and large language models (LLMs) for unified understanding of brain activity. The model uses a three-stage framework: first, a neural tokenizer maps fMRI signals into discrete, language-consistent tokens; second, a pretrained LLM is adapted to jointly model these fMRI tokens and text, enabling both temporal prediction and linguistic description of brain activity; and third, multi-task, multi-paradigm instruction tuning gives fMRI-LM the high-level semantic understanding needed for diverse applications. To overcome the scarcity of natural fMRI-text pairs, fMRI-LM constructs a large descriptive corpus by translating imaging-based features into structured textual descriptors. Pretrained on over 50,000 fMRI scans, the model demonstrates strong zero-shot and few-shot performance across various benchmarks and adapts to new tasks with parameter-efficient tuning (LoRA), establishing a scalable pathway toward a language-aligned, universal model for fMRI interpretation.
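To make the first stage concrete, here is a minimal sketch of mapping fMRI feature vectors to discrete token ids. It assumes a generic nearest-codebook (vector quantization) lookup, which this summary does not confirm as the paper's tokenizer design. The names `quantize`, `to_token_strings`, the `<fmri_k>` token format, and the toy codebook are all hypothetical.

```python
import numpy as np

def quantize(patches: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Return, for each row of `patches` (n, d), the index of the nearest row of `codebook` (k, d)."""
    # Pairwise differences via broadcasting: shape (n, k, d).
    diffs = patches[:, None, :] - codebook[None, :, :]
    # Squared Euclidean distances: shape (n, k).
    dists = np.sum(diffs ** 2, axis=-1)
    return np.argmin(dists, axis=1)

def to_token_strings(ids, prefix="<fmri_"):
    """Render token ids as special-token strings an LLM vocabulary could be extended with (hypothetical format)."""
    return [f"{prefix}{int(i)}>" for i in ids]

# Toy codebook with 8 well-separated codes in 4 dimensions (illustrative, not learned).
codebook = np.arange(32, dtype=float).reshape(8, 4)
# Three fake fMRI patch embeddings, each a slightly shifted copy of a code.
patches = codebook[[3, 0, 5]] + 0.1

token_ids = quantize(patches, codebook)
tokens = to_token_strings(token_ids)
print(tokens)
```

In a trained tokenizer the codebook would be learned jointly with an encoder; the lookup step itself would look much the same.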

Key takeaway

For AI Scientists and Research Scientists working on brain imaging, fMRI-LM offers a scalable approach to integrate fMRI data with language models. You should consider adopting its three-stage framework, particularly its method of generating synthetic fMRI-text descriptors, to enable language-grounded interpretation of neural activity and improve generalization across diverse neuroscience and clinical tasks, even with limited labeled data.
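As an illustration of turning imaging-based features into structured textual descriptors, the sketch below computes functional connectivity between regions and renders the strongest pairs as sentences. This is an assumed recipe, not the paper's actual pipeline: the choice of Pearson correlation, the sentence template, the function `describe_connectivity`, and the region names are all hypothetical.

```python
import numpy as np

def describe_connectivity(ts: np.ndarray, roi_names, top_k: int = 2) -> str:
    """Describe the `top_k` most strongly correlated region pairs in `ts` (timepoints, regions)."""
    # Pearson correlation between columns (regions).
    corr = np.corrcoef(ts, rowvar=False)
    n = corr.shape[0]
    # Rank each unordered region pair by absolute correlation.
    pairs = [(abs(corr[i, j]), i, j) for i in range(n) for j in range(i + 1, n)]
    pairs.sort(reverse=True)
    sentences = []
    for _, i, j in pairs[:top_k]:
        r = corr[i, j]
        sign = "positive" if r >= 0 else "negative"
        sentences.append(
            f"{roi_names[i]} and {roi_names[j]} show {sign} coupling (r={r:.2f})."
        )
    return " ".join(sentences)

# Synthetic time series: the first two regions are perfectly correlated by construction.
t = np.linspace(0.0, 2.0 * np.pi, 50)
ts = np.column_stack([np.sin(t), 2.0 * np.sin(t) + 1.0, np.cos(3.0 * t)])
text = describe_connectivity(ts, ["visual", "motor", "default_mode"], top_k=1)
print(text)
```

Descriptors built this way can be paired with the scan they were derived from, giving fMRI-text pairs without manual annotation.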

Key insights

fMRI-LM aligns fMRI with LLMs using synthetic text descriptors, enabling unified, language-grounded brain activity understanding.

Principles

Method

fMRI-LM tokenizes fMRI signals into discrete, language-consistent tokens, fine-tunes a pretrained LLM with fMRI-to-fMRI, fMRI-to-Text, and Text-to-Text objectives, then applies multi-task, multi-paradigm instruction tuning for diverse applications.
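One way to picture the three objectives is as context/target pairs fed to the same autoregressive model. The sketch below builds one toy example per objective. It assumes the loss is computed only on target tokens, a common convention that this summary does not confirm for fMRI-LM. The `Example` class, `make_example`, and all token contents are hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Example:
    objective: str
    tokens: List[str]
    loss_mask: List[int]  # 1 where the token contributes to the loss, 0 for context

def make_example(objective: str, context: List[str], target: List[str]) -> Example:
    """Concatenate context and target; mask out the context so only the target is supervised."""
    tokens = context + target
    loss_mask = [0] * len(context) + [1] * len(target)
    return Example(objective, tokens, loss_mask)

# Toy inputs: discrete fMRI tokens plus whitespace-split text (a real system would use the LLM tokenizer).
fmri = ["<fmri_3>", "<fmri_0>", "<fmri_5>", "<fmri_1>"]
caption = "strong visual motor coupling".split()
question = "what is functional connectivity ?".split()
answer = "statistical dependence between regional signals".split()

examples = [
    make_example("fmri_to_fmri", fmri[:2], fmri[2:]),   # predict later fMRI tokens from earlier ones
    make_example("fmri_to_text", fmri, caption),        # describe brain activity in words
    make_example("text_to_text", question, answer),     # ordinary language modeling
]

for ex in examples:
    print(ex.objective, ex.tokens, ex.loss_mask)
```

Mixing all three kinds of example in one training stream lets a single set of model weights handle temporal prediction and description while still being trained on plain text.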

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.