Deterministic Event-Graph Substrates as World Models for Counterfactual Reasoning

Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

A new class of world models, called "event-graph substrates," represents state as an append-only log of typed RDF triples and answers counterfactual queries by forking the log under structured interventions. The approach is inspectable at the triple level, supports exact counterfactuals, and transfers across domains without learned components. Implemented with a 1,400-line CLEVRER-DSL interpreter, the substrate outperforms the symbolic-oracle baseline NS-DR on the CLEVRER video causal-reasoning benchmark across descriptive (9.89%), explanatory (20.26%), and counterfactual (17.65%) per-question metrics, and it exceeds the parametric ALOE baseline on descriptive and explanatory tasks. On a new "twin-EventLog" benchmark for agent memory consistency, the substrate surpasses Llama-3.1-8B by 18.80 percentage points and a Park/Concordia-style LLM simulator by 65 percentage points on joint accuracy.

Key takeaway

Research scientists developing agentic systems that require transparent and precise causal reasoning should consider event-graph substrates. The approach offers formal guarantees on inspectability and replay consistency, and it outperforms the LLM and symbolic baselines it was compared against on specific causal-reasoning tasks. Prioritize it when exact intervention semantics and auditable predictions are critical, especially in closed-event reasoning scenarios.

Key insights

Deterministic replay over typed event deltas enables auditable, exact counterfactual world modeling without learned latent simulation.
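
To illustrate what "auditable" can mean here, the following is a minimal sketch, not the paper's implementation. It assumes a log of (tick, op, triple) entries with hypothetical "add" and "del" operations; the names replay_with_provenance and state_digest are invented for this example. Each fact in the replayed state points back to the log entry that asserted it, and a digest of the state lets two replays be compared for exact agreement.

```python
import hashlib
import json

# Hypothetical log entries: (tick, op, (subject, predicate, object)).
# The "add"/"del" vocabulary is an assumption made for this sketch.
LOG = [
    (0, "add", ("ball", "color", "red")),
    (1, "add", ("ball", "moving", "true")),
    (4, "del", ("ball", "moving", "true")),
    (4, "add", ("ball", "moving", "false")),
]


def replay_with_provenance(log):
    """Fold the log in order; map each live triple to the index that asserted it."""
    state = {}
    for index, (_tick, op, triple) in enumerate(log):
        if op == "add":
            state[triple] = index
        elif op == "del":
            state.pop(triple, None)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return state


def state_digest(state):
    """Hash the sorted set of live triples so replays can be compared exactly."""
    payload = json.dumps(sorted(state), separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


first = replay_with_provenance(LOG)
second = replay_with_provenance(LOG)

# Same log, same fold, same digest: the replay has no hidden randomness.
print(state_digest(first) == state_digest(second))  # True

# Audit a single fact: which log entry put it in the state?
print(first[("ball", "moving", "false")])  # 3
```

Because the state is a pure function of the log, any disagreement between two digests can be traced to a concrete difference in log entries rather than to sampling noise.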

Method

Event-graph substrates use an append-only log of typed RDF triples, a deterministic replay function, and an intervention vocabulary to answer counterfactual queries by forking the log at a specific tick.
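
The three ingredients named above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' code: the Event fields, the "add"/"del" operations, the fork function, and the remove_entity intervention are all hypothetical names chosen for the example. The intervention here only filters logged events; a full substrate would presumably also re-derive downstream events under its own dynamics, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass(frozen=True)
class Event:
    tick: int
    op: str  # "add" or "del" (assumed vocabulary for this sketch)
    triple: Triple


def replay(log: List[Event]) -> FrozenSet[Triple]:
    """Deterministically fold a tick-ordered log into a set of triples."""
    state = set()
    for ev in log:
        if ev.op == "add":
            state.add(ev.triple)
        elif ev.op == "del":
            state.discard(ev.triple)
        else:
            raise ValueError(f"unknown op: {ev.op!r}")
    return frozenset(state)


Intervention = Callable[[Event], Optional[Event]]


def fork(log: List[Event], at_tick: int, intervention: Intervention) -> List[Event]:
    """Build a new log: keep events before at_tick, rewrite or drop the rest."""
    forked = []
    for ev in log:
        if ev.tick < at_tick:
            forked.append(ev)
        else:
            changed = intervention(ev)
            if changed is not None:
                forked.append(changed)
    return forked


def remove_entity(name: str) -> Intervention:
    """Hypothetical intervention: drop every event that mentions `name`."""
    def apply(ev: Event) -> Optional[Event]:
        subject, _predicate, obj = ev.triple
        return None if name in (subject, obj) else ev
    return apply


log = [
    Event(0, "add", ("cube", "is_a", "object")),
    Event(0, "add", ("sphere", "is_a", "object")),
    Event(3, "add", ("cube", "collides_with", "sphere")),
    Event(5, "add", ("sphere", "exits", "scene")),
]

factual = replay(log)
counterfactual = replay(fork(log, at_tick=0, intervention=remove_entity("cube")))

collision = ("cube", "collides_with", "sphere")
print(collision in factual)         # True
print(collision in counterfactual)  # False
print(len(log))                     # 4
```

The fork returns a new list and leaves the original log untouched, so the factual and counterfactual branches can be replayed and compared side by side.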

Best for: Research Scientist, AI Scientist, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.