LLM Features Can Hurt GNNs: Concatenation Interference on Homophilous Graph Benchmarks
Summary
A recent study reports that directly concatenating LLM-generated node features to the inputs of graph neural networks (GNNs) can decrease accuracy on homophilous graph benchmarks, contradicting widespread reports of improvement. With an MLP backbone and SBERT-encoded GPT-4o-mini TAPE features on the Planetoid public split, concatenation lowered PubMed test accuracy by 17.0 +/- 0.3 pp and Cora test accuracy by 4.3 +/- 0.6 pp. The degradation lessens with GNN backbones (GCN, GCNII, GAT), random splits, or smaller encoders, and it reverses on medium-homophily datasets such as WikiCS (+4.4 pp) and ogbn-arxiv (+11.7 pp). The research introduces "Delta_sig", a measure of LLM-alone discriminability, which correlates more strongly with the concatenation cost (r^2 = 0.38) than homophily does (r^2 = 0.06) across nine datasets. A power law, |Delta_concat| proportional to (sqrt(d_l/n))^1.31 with r^2 = 0.97, further explains the observed performance drops.
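As a worked reading of the power law quoted above: if n is held fixed (this summary does not define n, so that is an assumption made only for illustration), doubling d_l multiplies sqrt(d_l/n) by sqrt(2), so the predicted magnitude of the accuracy change scales as follows.

```latex
\[
\frac{\lvert \Delta_{\mathrm{concat}}(2 d_l) \rvert}{\lvert \Delta_{\mathrm{concat}}(d_l) \rvert}
= \left(\sqrt{2}\right)^{1.31}
= 2^{0.655}
\approx 1.57
\]
```

Under the reported exponent, doubling d_l would therefore correspond to a predicted cost roughly 1.57 times larger, provided the fitted relationship holds outside the settings the study measured.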
Key takeaway
Machine learning engineers integrating LLM features into graph neural networks should evaluate the effect of simple input concatenation rather than assume it helps. On highly homophilous graphs, or when the LLM features show high "Delta_sig" discriminability, direct concatenation may degrade accuracy instead of improving it. Before deployment, assess "Delta_sig" for your specific LLM features and dataset, or consider alternative integration strategies such as joint training or distillation, to avoid performance regressions.
Key insights
Concatenating LLM features to GNNs can degrade accuracy on homophilous graphs, especially with high LLM-alone discriminability.
Principles
- LLM feature concatenation isn't universally beneficial for GNNs.
- High LLM-alone discriminability ("Delta_sig") predicts performance drops.
- The size of the accuracy change follows a power law in sqrt(d_l/n), a quantity tied to feature dimensions, with exponent 1.31 (r^2 = 0.97).
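A power-law relationship of this kind is commonly fitted as a straight line in log-log space. The sketch below shows that generic procedure in plain Python. It is not the study's code: the function name `fit_power_law`, the value of `d_l`, the list of `n` values, and the synthetic `ys` are all hypothetical, and computing r^2 in log space is an assumption about how such a fit might be scored.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = c * x**k in log-log space.

    Returns (k, c, r2), where r2 is computed on the log-transformed values.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    count = len(log_x)
    mean_x = sum(log_x) / count
    mean_y = sum(log_y) / count
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    k = sxy / sxx                      # slope = power-law exponent
    log_c = mean_y - k * mean_x        # intercept = log of the prefactor
    ss_res = sum((b - (log_c + k * a)) ** 2 for a, b in zip(log_x, log_y))
    ss_tot = sum((b - mean_y) ** 2 for b in log_y)
    r2 = 1.0 - ss_res / ss_tot
    return k, math.exp(log_c), r2


# Synthetic placeholder data, generated from an exact power law so the fit
# can be checked. None of these numbers come from the study.
d_l = 384                              # hypothetical LLM feature dimension
ns = [500, 2000, 8000, 32000]          # hypothetical values of n
xs = [math.sqrt(d_l / n) for n in ns]
ys = [5.0 * x ** 1.31 for x in xs]     # |Delta_concat| stand-ins

k, c, r2 = fit_power_law(xs, ys)
print(f"exponent={k:.2f} prefactor={c:.2f} r2={r2:.3f}")
```

To use it on real measurements, replace `xs` with sqrt(d_l/n) per configuration and `ys` with the absolute accuracy change observed for each. Both must be strictly positive for the logarithms to be defined.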
Method
The study proposes "Delta_sig", a measure of LLM-alone discriminability, to predict whether concatenating LLM features will help or hurt GNN performance on a given dataset.
In practice
- Evaluate "Delta_sig" before concatenating LLM features to GNNs.
- Avoid simple concatenation on highly homophilous graphs.
- Consider alternative integration methods beyond pure concatenation.
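One way to act on these points is to measure the effect of concatenation directly, by training the same model with and without the LLM features over several seeds and reporting the paired change in percentage points. The following is a minimal sketch of that bookkeeping only. The function name `concat_delta_pp` and the accuracy values are hypothetical, and reporting a sample standard deviation is an assumption, since this summary does not say what the "+/-" in the study's figures denotes.

```python
from statistics import mean, stdev


def concat_delta_pp(base_accs, concat_accs):
    """Paired per-seed accuracy change, in percentage points.

    base_accs:   test accuracies (fractions in [0, 1]) without LLM features
    concat_accs: test accuracies with LLM features concatenated, same seeds
    Returns (mean change, sample standard deviation of the change).
    """
    if len(base_accs) != len(concat_accs):
        raise ValueError("need one baseline and one concat accuracy per seed")
    if len(base_accs) < 2:
        raise ValueError("need at least two seeds to estimate a spread")
    deltas = [100.0 * (c - b) for b, c in zip(base_accs, concat_accs)]
    return mean(deltas), stdev(deltas)


# Hypothetical accuracies for three seeds; not results from the study.
base = [0.80, 0.81, 0.79]
concat = [0.76, 0.78, 0.75]

delta_mean, delta_std = concat_delta_pp(base, concat)
print(f"Delta_concat = {delta_mean:+.1f} +/- {delta_std:.1f} pp")
```

A clearly negative mean that is large relative to its spread would suggest that concatenation hurts on that dataset and backbone, which is the situation where the alternative integration methods listed above are worth trying.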
Topics
- LLM Features
- Graph Neural Networks
- Feature Concatenation
- Homophilous Graphs
- Delta_sig Metric
- GPT-4o-mini
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.