LLM-Conditioned Synthesis of Pathological Gaits via Structured Gait-Language Representations

Source: cs.CV updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Health & Medical Research) · Depth: Expert, medium

Summary

A new multimodal LLM-guided framework synthesizes pathology-aware 3D gait data from structured textual descriptions, addressing the scarcity of real pathological gait datasets. It generates fixed-length, skeleton-based synthetic gait sequences for use in classification tasks. The framework integrates motion tokenisation, pathology-aware language conditioning, LLM-based semantic augmentation, and language-to-gait generation. Its core innovation is a pathological tokeniser that preserves pathology-specific motion characteristics during discrete representation learning. Experiments on the Pathological Gait Dataset by Jun et al. show that combining synthetic data with real data improves downstream classification: under a leave-one-subject-out protocol, a GRU classifier reached 92.77% accuracy, up from 91.08% with real data alone. The framework, which uses a fine-tuned GPT-2, also outperformed MotionGPT (90.26%) and Qwen-5B (79.86%).
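The leave-one-subject-out protocol mentioned above can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the function names `leave_one_subject_out` and `augmented_training_set` and the toy data are hypothetical, and it assumes synthetic sequences are added to the training side of each fold only, so that every held-out subject is evaluated on real recordings.

```python
from collections import defaultdict


def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    by_subject = defaultdict(list)
    for index, subject in enumerate(subject_ids):
        by_subject[subject].append(index)
    for held_out in sorted(by_subject):
        test_idx = list(by_subject[held_out])
        train_idx = sorted(
            i
            for subject, indices in by_subject.items()
            if subject != held_out
            for i in indices
        )
        yield held_out, train_idx, test_idx


def augmented_training_set(real_samples, train_idx, synthetic_samples):
    """Return this fold's real training samples followed by all synthetic samples.

    Assumption: synthetic data never enters the test side of a fold.
    """
    return [real_samples[i] for i in train_idx] + list(synthetic_samples)


# Hypothetical toy data: strings stand in for gait sequences.
real = ["r0", "r1", "r2", "r3", "r4"]
subjects = ["s1", "s1", "s2", "s3", "s3"]
synthetic = ["syn0", "syn1"]

for held_out, train_idx, test_idx in leave_one_subject_out(subjects):
    train = augmented_training_set(real, train_idx, synthetic)
    test = [real[i] for i in test_idx]
    print(held_out, len(train), len(test))
```

The summary does not say whether the generator itself is refit per fold. In a stricter setup, the synthetic sequences for a fold would come from a generator that never saw the held-out subject.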

Key takeaway

Machine learning engineers building pathological gait models with limited data should consider LLM-guided synthetic data generation. With a pathology-aware tokeniser, the approach raised a GRU classifier's leave-one-subject-out accuracy from 91.08% to 92.77%, a gain of 1.69 percentage points. Fine-tuning a model such as GPT-2 with pathology-specific priors is one way to augment scarce datasets for diagnostic and rehabilitation applications.

Key insights

The framework uses an LLM and a specialized pathology-aware tokeniser to synthesize 3D gait data; adding that data to scarce real datasets improves classification accuracy.

Method

The method encodes 3D gait sequences, tokenises them along spatial, temporal, and pathology-specific dimensions, and maps the resulting tokens to language. An LLM fine-tuned with pathology priors then generates language tokens, which a decoder reconstructs into 3D gait sequences.
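The data flow described above (motion to discrete tokens, tokens to language, language back to motion) can be illustrated with a toy round trip. This is a sketch under strong assumptions, not the paper's tokeniser: it uses a fixed hand-written codebook with nearest-neighbour lookup instead of a learned one, it omits the LLM entirely, and the token format `<motion_k>`, the `<pathology:...>` prefix, and all function names are hypothetical.

```python
import math


def nearest_code(frame, codebook):
    """Index of the codebook vector closest to the frame (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))


def tokenize(sequence, codebook):
    """Quantise each frame of a motion sequence to a discrete token id."""
    return [nearest_code(frame, codebook) for frame in sequence]


def to_language(tokens, pathology):
    """Render token ids as text, prefixed with a pathology condition tag."""
    motion = " ".join(f"<motion_{t}>" for t in tokens)
    return f"<pathology:{pathology}> {motion}"


def from_language(text):
    """Recover token ids from text, ignoring anything that is not a motion token."""
    prefix = "<motion_"
    return [
        int(word[len(prefix):-1])
        for word in text.split()
        if word.startswith(prefix) and word.endswith(">")
    ]


def detokenize(tokens, codebook):
    """Map token ids back to their codebook vectors (a lossy reconstruction)."""
    return [codebook[t] for t in tokens]


# Hypothetical 2-D "poses"; real skeleton frames would have many more values.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
sequence = [(0.1, 0.0), (0.9, 0.1), (0.1, 0.8)]

tokens = tokenize(sequence, codebook)
text = to_language(tokens, "antalgic")
reconstructed = detokenize(from_language(text), codebook)
print(text)
print(reconstructed)
```

In the described framework, the text step is where a fine-tuned LLM would generate new motion-token strings from a pathology description; here the string is simply parsed back to show that the representation is reversible up to quantisation error.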

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CV updates on arXiv.org.