Grammar of the Wave: Towards Explainable Multivariate Time Series Event Detection via Neuro-Symbolic VLM Agents
Summary
"Grammar of the Wave" introduces Knowledge-Guided Time Series Event Detection (K-TSED), a new task: identifying semantically complex events in multivariate signals when labeled data is limited. The authors propose the Event Logic Tree (ELT), a knowledge representation framework that translates natural language event descriptions into hierarchical temporal-logic structures. ELT underpins SELA, a neuro-symbolic VLM agent system that iteratively instantiates signal primitives and composes them according to ELT constraints, producing both event detections and faithful explanations. To validate the approach, the authors built a new benchmark, KITE, from 41 real-world energy production time series datasets with expert annotations. In the reported experiments, SELA outperforms supervised fine-tuning baselines and existing zero-shot LLM/VLM methods such as Numeric and VL-Time, reaches F1@0.5 scores close to those of human experts, and mitigates VLM hallucination.
Key takeaway
AI scientists and machine learning engineers building explainable time series event detection for low-resource, high-stakes domains should explore neuro-symbolic VLM agent frameworks such as SELA. By grounding detection in Event Logic Trees, the approach supports zero-shot detection from natural language descriptions, outperforms supervised fine-tuning and zero-shot LLM/VLM baselines in the reported experiments, and yields verifiable explanations. It also mitigates VLM hallucination, which supports greater trust in automated event identification.
Key insights
Neuro-symbolic VLM agents utilize Event Logic Trees for explainable, zero-shot time series event detection from natural language descriptions.
Principles
- Events require hierarchical, semantically quantified, and temporally elastic representation.
- A physical channel supports only one active primitive state at any time point.
- Temporal conjunctions (SEQ, SYNC, GUARD) and disjunction (OR) form a complete operator basis.
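The principles above can be pictured as a small tree data structure. The following is a minimal sketch, not the authors' implementation: the operator names come from the summary, but the class names, field names, and the flat overlap check are assumptions made for illustration. The check treats "one active primitive state per channel" as "no two primitives on the same channel have overlapping intervals" and does not account for mutually exclusive OR branches.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import combinations
from typing import List, Tuple, Union


class Op(Enum):
    # Operator names as listed in the summary; their semantics are not modeled here.
    SEQ = "SEQ"
    SYNC = "SYNC"
    GUARD = "GUARD"
    OR = "OR"


@dataclass
class Primitive:
    """Leaf node: a described state on one physical channel (hypothetical fields)."""
    channel: str
    description: str
    interval: Tuple[float, float]  # (start, end) once instantiated on a signal


@dataclass
class Node:
    """Internal node: an operator applied to child nodes or primitives."""
    op: Op
    children: List[Union["Node", Primitive]] = field(default_factory=list)


def leaves(tree):
    """Collect all primitives under a node, left to right."""
    if isinstance(tree, Primitive):
        return [tree]
    out = []
    for child in tree.children:
        out.extend(leaves(child))
    return out


def channel_conflicts(tree):
    """Return every pair of same-channel primitives whose intervals overlap."""
    by_channel = {}
    for p in leaves(tree):
        by_channel.setdefault(p.channel, []).append(p)
    conflicts = []
    for prims in by_channel.values():
        for a, b in combinations(prims, 2):
            # Half-open overlap test: touching endpoints do not count as a conflict.
            if a.interval[0] < b.interval[1] and b.interval[0] < a.interval[1]:
                conflicts.append((a, b))
    return conflicts
```

A tree whose primitives on the same channel only touch at their endpoints passes the check, while two simultaneous states on one channel are reported as a conflict.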
Method
SELA uses Logic Analyst agents to build ELT schemas from text and Signal Inspector agents to instantiate them on time series data via active visualization tools.
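To make the "instantiate, then compose" idea concrete, here is a rough sketch of bottom-up composition over a schema. It is an illustration under stated assumptions, not SELA itself: the schema is a hand-written nested tuple standing in for a Logic Analyst's output, the `found` dictionary stands in for intervals a Signal Inspector might return, and the leaf names are hypothetical. The operator semantics are also assumed: SEQ requires children in non-overlapping order, SYNC requires a common overlap, and OR takes the first satisfied child. GUARD is omitted because the summary does not define it.

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]


def evaluate(node, found: Dict[str, Optional[Interval]]) -> Optional[Interval]:
    """Compose child intervals bottom-up; return None if the node is not satisfied."""
    kind = node[0]
    if kind == "LEAF":
        # A leaf is satisfied only if an interval was found for its name.
        return found.get(node[1])

    results = [evaluate(child, found) for child in node[1]]

    if kind == "OR":
        # Assumed semantics: the first satisfied alternative wins.
        hits = [r for r in results if r is not None]
        return hits[0] if hits else None

    # Conjunctions need every child to be satisfied.
    if any(r is None for r in results):
        return None

    if kind == "SEQ":
        # Assumed semantics: each child ends before the next one starts.
        ordered = all(a[1] <= b[0] for a, b in zip(results, results[1:]))
        return (results[0][0], results[-1][1]) if ordered else None

    if kind == "SYNC":
        # Assumed semantics: all children share a non-empty common overlap.
        start = max(r[0] for r in results)
        end = min(r[1] for r in results)
        return (start, end) if start < end else None

    raise ValueError(f"unsupported operator: {kind}")


# Hypothetical schema: a pressure rise followed by a simultaneous flow drop and temperature spike.
schema = ("SEQ", [
    ("LEAF", "pressure_rise"),
    ("SYNC", [("LEAF", "flow_drop"), ("LEAF", "temp_spike")]),
])
found = {"pressure_rise": (0.0, 4.0), "flow_drop": (5.0, 9.0), "temp_spike": (6.0, 10.0)}
print(evaluate(schema, found))  # (0.0, 9.0)
```

Because each node either returns an interval or `None`, the path of satisfied nodes doubles as a simple explanation of why an event was or was not detected.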
In practice
- Utilize ELT to translate natural language event descriptions into structured temporal logic.
- Employ active visualization tools to refine time series interval detection.
- Benchmark K-TSED solutions using the KITE dataset from energy production.
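For benchmarking, the summary reports F1@0.5 but does not define it. A common reading, assumed here, is an F1 score in which a predicted interval counts as a true positive when its intersection-over-union (IoU) with a not-yet-matched ground-truth interval is at least 0.5. The sketch below follows that assumption with greedy one-to-one matching; the function names are hypothetical, and the KITE evaluation protocol may differ.

```python
def iou(a, b):
    """Intersection-over-union of two (start, end) intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(pred, truth, threshold=0.5):
    """F1 over intervals, matching each prediction to at most one ground-truth interval."""
    unmatched = list(truth)
    tp = 0
    for p in pred:
        # Greedily pick the remaining ground-truth interval with the highest IoU.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= threshold:
            tp += 1
            unmatched.remove(best)
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(truth)
    return 2 * precision * recall / (precision + recall)


pred = [(0, 10), (20, 30), (50, 60)]
truth = [(1, 11), (22, 40)]
score = f1_at_iou(pred, truth)  # one match out of three predictions and two truths
```

In this example only the first prediction clears the threshold (IoU 9/11), so precision is 1/3, recall is 1/2, and the score works out to 0.4.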
Topics
- Knowledge-Guided TSED
- Neuro-Symbolic AI
- Vision-Language Models
- Event Logic Tree
- Multi-Agent Systems
- Explainable AI
- KITE Dataset
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.