Tensor-based second-order causal discovery
Summary
Tensor-based Second-order Causal Discovery (TSCD) is an algorithm for recovering causal dependencies among variables. Developed by Nathan Ouyang, Kexin Wan, and Anna Seigal, it takes as input a tensor built from the covariance matrices of observational and interventional data. It assumes a linear structural equation model on a directed acyclic graph (DAG) with uncorrelated noise variables, and it outputs the DAG and its edge functions. A version for nonlinear models is also implemented. Working with second-order statistics offers statistical and computational efficiency, improves identifiability relative to first-order statistics, and applies whether or not the variables are Gaussian. The authors show that the causal order and parameters are identifiable from a number of interventions that is logarithmic in the number of variables, and they report that TSCD is robust to noise, competitive with existing methods, and scalable to hundreds of variables. Code is available on GitHub.
Key takeaway
For research scientists evaluating causal discovery algorithms, TSCD is worth considering. It relies only on second-order statistics, which keeps computation efficient and preserves identifiability even when variables are non-Gaussian. It is a candidate for problems with hundreds of variables, since the number of interventions it needs grows only logarithmically in the number of variables. The code is publicly available, so you can test its robustness to noise directly on your own datasets.
Key insights
TSCD uses second-order statistics from observational and interventional data to efficiently discover causal graphs and functions, scaling to hundreds of variables.
Principles
- Second-order statistics offer efficiency and identifiability.
- The number of interventions needed for identifiability can grow logarithmically in the number of variables.
- Covariance-based methods apply whether or not the variables are Gaussian.
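To make the second-order principle concrete, here is the standard covariance identity for a linear structural equation model. The symbols B (edge-weight matrix), ε (noise vector), and D (diagonal noise covariance) are notation assumed for this note and may differ from the paper's.

```latex
\begin{aligned}
X &= B X + \varepsilon, \qquad \operatorname{Cov}(\varepsilon) = D = \operatorname{diag}(d_1, \dots, d_p), \\
X &= (I - B)^{-1} \varepsilon, \\
\Sigma &= \operatorname{Cov}(X) = (I - B)^{-1} D \, (I - B)^{-\top}.
\end{aligned}
```

Because the graph is acyclic, B is strictly triangular under a topological ordering of the variables, so I - B is always invertible. The identity uses only the second moments of the noise, which is why it holds for any noise distribution with finite variance, Gaussian or not.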
Method
TSCD constructs a tensor from observational and interventional covariance matrices. It then recovers the DAG and edge functions under a linear structural equation model with uncorrelated noise. A variant for nonlinear models is also implemented.
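The input described above can be illustrated with a minimal sketch that builds a covariance tensor from a known linear model. This is not the authors' implementation: the function names, the example graph, and the intervention model (rescaling the noise variance of the targeted node) are all assumptions made for illustration, and the sketch covers only the tensor construction, not the recovery step.

```python
import numpy as np

def linear_sem_covariance(B, noise_var):
    """Population covariance of X, where X = B @ X + e and Cov(e) = diag(noise_var)."""
    p = B.shape[0]
    A = np.linalg.inv(np.eye(p) - B)  # X = A @ e
    return A @ np.diag(noise_var) @ A.T

def covariance_tensor(B, noise_var, targets, scale=4.0):
    """Stack the observational covariance and one covariance per intervention.

    Assumed (hypothetical) intervention model: each intervention multiplies
    the noise variance of its target nodes by `scale`.
    """
    mats = [linear_sem_covariance(B, noise_var)]
    for t in targets:
        v = noise_var.copy()
        v[list(t)] *= scale
        mats.append(linear_sem_covariance(B, v))
    return np.stack(mats, axis=0)  # shape: (1 + number of interventions, p, p)

# Example chain 0 -> 1 -> 2; B[i, j] is the weight of the edge j -> i.
B = np.array([[0.0,  0.0, 0.0],
              [0.8,  0.0, 0.0],
              [0.0, -0.5, 0.0]])
noise_var = np.ones(3)

T = covariance_tensor(B, noise_var, targets=[(0,), (1,)])
print(T.shape)  # (3, 3, 3)
```

In this toy example, intervening on node 1 leaves the variance of its ancestor, node 0, unchanged while the variances of nodes 1 and 2 shift. Asymmetries of this kind across slices are the sort of second-order signal that carries information about causal order.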
In practice
- Apply TSCD when both observational and interventional data are available and robustness to noise matters.
- Use the TSCD code on GitHub for linear or nonlinear models.
- Consider TSCD for problems with hundreds of variables.
Topics
- Causal Discovery
- Tensor Methods
- Second-Order Statistics
- Directed Acyclic Graphs
- Structural Equation Models
- Interventional Data
Best for: AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.