Lazy or Efficient? Towards Accessible Eye-Tracking Event Detection Using LLMs
Summary
A large language model (LLM)-driven pipeline automates eye-tracking event detection, a task that traditionally requires specialized programming knowledge and careful handling of diverse raw data formats. The code-free framework, developed by Dongyang Guo, Yasmeen Abdrabou, and Enkelejda Kasneci from the Technical University of Munich, converts natural language instructions into an end-to-end analysis workflow. The system infers the data structure automatically, generates executable data-cleaning code, implements classical detectors such as I-VT (velocity-threshold identification) and I-DT (dispersion-threshold identification) from user prompts, and returns results with explanatory reports. Evaluated on public benchmarks such as GazeCom, GazeBase_v2, Hollywood2_em, and a pair programming dataset, the approach reaches accuracy comparable to, and in some cases better than, traditional methods while reducing technical overhead and improving accessibility for non-expert users.
Key takeaway
For research scientists and machine learning engineers working with eye-tracking data, this LLM-driven pipeline reduces the technical burden of event detection. The reported results indicate competitive detection performance from natural language prompts alone, without extensive programming expertise; parameters are adjusted by editing prompts rather than code. Consider the framework for streamlining data preprocessing and analysis, especially when working with diverse datasets or when reproducibility and interpretability matter.
Key insights
An LLM-driven pipeline automates eye-tracking event detection, making complex analysis accessible without specialized programming.
Principles
- LLMs can translate natural language into executable analysis components.
- Automated data analysis reduces technical barriers and improves reproducibility.
- Iterative prompt refinement enhances model performance and user control.
Method
The pipeline has three stages: LLM-assisted inference of data semantics, synthesis of preprocessing code, and iterative diagnosis of results with user feedback. Across these stages it generates and refines implementations of classical eye-tracking algorithms such as I-VT and I-DT.
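To make the detector side concrete, here is a minimal sketch of what a velocity-threshold (I-VT) detector could look like. It is not the code the pipeline generates: the function names `ivt_labels` and `group_events`, the input layout (timestamps in seconds, gaze positions in degrees of visual angle), and the 30 deg/s default threshold are all assumptions chosen for illustration.

```python
import math


def ivt_labels(t, x, y, velocity_threshold=30.0):
    """Label each gaze sample 'fixation' or 'saccade' by point-to-point speed.

    Assumptions for this sketch: t is in seconds, x and y are in degrees of
    visual angle, and velocity_threshold (deg/s) is an illustrative default.
    """
    if not (len(t) == len(x) == len(y)):
        raise ValueError("t, x and y must have the same length")
    if len(t) < 2:
        return ["fixation"] * len(t)

    speeds = []
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        speeds.append(math.hypot(x[i] - x[i - 1], y[i] - y[i - 1]) / dt)

    # The first sample has no predecessor, so it reuses the first speed.
    speeds.insert(0, speeds[0])
    return ["saccade" if s > velocity_threshold else "fixation" for s in speeds]


def group_events(labels):
    """Collapse per-sample labels into (label, start_index, end_index) runs."""
    events = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            events.append((labels[start], start, i - 1))
            start = i
    return events
```

Calling `group_events(ivt_labels(t, x, y))` yields a list of labeled runs with inclusive sample indices. Real recordings would also need the cleaning step the pipeline describes (for example handling missing samples) before a detector like this is applied.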
In practice
- Use LLMs to infer raw data structure and metadata.
- Generate data cleaning and detector code from natural language prompts.
- Iteratively refine analysis parameters by editing LLM prompts.
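The parameters a user would refine through prompts can be seen in a dispersion-threshold (I-DT) detector. The sketch below is a hedged illustration, not the pipeline's output: the names `dispersion` and `idt_fixations`, the millisecond timestamps, and the default values for `dispersion_threshold` and `min_duration` are assumptions.

```python
def dispersion(x, y, start, end):
    """Dispersion of samples start..end (inclusive): x range plus y range."""
    xs = x[start:end + 1]
    ys = y[start:end + 1]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def idt_fixations(t, x, y, dispersion_threshold=1.0, min_duration=100):
    """Return fixations as (start_index, end_index) pairs, both inclusive.

    Assumptions for this sketch: t and min_duration share one unit
    (milliseconds here), x and y are in degrees of visual angle, and both
    defaults are illustrative values.
    """
    if not (len(t) == len(x) == len(y)):
        raise ValueError("t, x and y must have the same length")

    fixations = []
    n = len(t)
    start = 0
    while start < n:
        # Find the smallest window from `start` spanning min_duration.
        end = start
        while end < n and t[end] - t[start] < min_duration:
            end += 1
        if end >= n:
            break  # Remaining samples are too short to form a fixation.

        if dispersion(x, y, start, end) <= dispersion_threshold:
            # Grow the window while the samples stay tightly clustered.
            while end + 1 < n and dispersion(x, y, start, end + 1) <= dispersion_threshold:
                end += 1
            fixations.append((start, end))
            start = end + 1
        else:
            # Window is too spread out: drop its first sample and retry.
            start += 1
    return fixations
```

Raising `dispersion_threshold` merges nearby fixations and raising `min_duration` discards short ones, which is the kind of adjustment a prompt edit such as "use a stricter minimum fixation duration" would map onto.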
Topics
- Eye-Tracking Event Detection
- Large Language Models
- Gaze Data Analysis
- Code Generation
- I-VT Algorithm
Best for: AI Scientist, Research Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.