Ontology-constrained multi-LLM scoring of hypothesis support in the predictive processing literature
Summary
This multi-LLM pipeline performs ontology-constrained literature synthesis for predictive coding neuroscience. A council of ten local language models scores 31 studies against a 36-concept glossary. The glossary defines three hypotheses (predictive suppression, feedforward error propagation, and ubiquity), each evaluated in local and global oddball contexts. The pipeline extracts evidence, incorporates figure descriptions produced by a vision-language model, and validates the outputs. In local oddball contexts, mean scores are higher for predictive suppression (mean = 0.46) and feedforward error propagation (mean = 0.51) than for ubiquity (mean = 0.23); scores are generally lower in global oddball contexts. A new "hypothesis-space temperature" metric quantifies dispersion and shows greater variability in global oddball contexts (0.00348) than in local oddball contexts (0.00114). The framework makes disagreement measurable and auditable, mapping a heterogeneous literature into a quantitative evidence space.
Key takeaway
Research scientists synthesizing fragmented interdisciplinary literature should consider implementing a local multi-LLM council guided by an expert-defined ontology. The framework provides an auditable, quantitative map of hypothesis support, revealing structured agreement and disagreement across studies. It also measures inter-model variability and can track how a literature shifts over time, which supports cumulative theory-building in settings where traditional meta-analysis is limited.
Key insights
A multi-LLM council with an expert-guided ontology can quantitatively map scientific literature's support for hypotheses, revealing structured agreement and disagreement.
Principles
- Multi-LLM councils turn disagreement into a measurable analytical signal.
- Expert-guided ontologies and instruction layers constrain LLM outputs.
- Hypothesis-space temperature measures theoretical consensus or volatility.
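The summary does not give the formula for hypothesis-space temperature, so the sketch below is only a stand-in for the general idea of dispersion in a hypothesis space. It assumes each study is represented as a vector of hypothesis scores and reports the mean squared distance of those vectors from their centroid. The function name `dispersion` and the example vectors are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def dispersion(points):
    """Mean squared Euclidean distance of score vectors from their centroid.

    ASSUMPTION: this is a generic dispersion measure used for illustration,
    not the paper's definition of hypothesis-space temperature.
    """
    if not points:
        raise ValueError("need at least one score vector")
    dims = len(points[0])
    if any(len(p) != dims for p in points):
        raise ValueError("all score vectors must have the same length")
    # Centroid: per-dimension mean across studies.
    centroid = [fmean(p[d] for p in points) for d in range(dims)]
    # Average the squared distance of each study from that centroid.
    return fmean(
        sum((p[d] - centroid[d]) ** 2 for d in range(dims)) for p in points
    )


# Illustrative vectors only (one entry per hypothesis, one tuple per study).
tight_cluster = [(0.50, 0.50, 0.20), (0.50, 0.50, 0.20)]
loose_cluster = [(0.00, 0.00, 0.00), (1.00, 0.00, 0.00)]

print(dispersion(tight_cluster))  # identical vectors give zero dispersion
print(dispersion(loose_cluster))  # spread-out vectors give a larger value
```

Under this stand-in, a lower value means the studies sit close together in the hypothesis space (consensus) and a higher value means they are scattered (volatility).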
Method
Using a council of local LLMs, the pipeline reads papers, extracts evidence (including figure descriptions), assembles ontology-constrained prompts, and validates the outputs against an expert glossary.
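The prompt-assembly and validation steps might look roughly like the sketch below. It is not the authors' implementation: the key format, the score range of -1 to 1, the JSON output contract, and the function names `build_prompt` and `validate_output` are all assumptions made for illustration, and glossary definitions would come from the domain experts.

```python
import json

# Hypothesis and context names follow the summary; the identifiers are assumed.
HYPOTHESES = ("predictive_suppression", "feedforward_error_propagation", "ubiquity")
CONTEXTS = ("local_oddball", "global_oddball")
SCORE_MIN, SCORE_MAX = -1.0, 1.0  # ASSUMPTION: the source does not state the range


def build_prompt(glossary, evidence, figure_notes):
    """Assemble a prompt that restricts the model to glossary terms."""
    terms = "\n".join(f"- {name}: {text}" for name, text in sorted(glossary.items()))
    keys = ", ".join(f"{c}/{h}" for c in CONTEXTS for h in HYPOTHESES)
    return (
        "Use only the glossary terms below.\n"
        f"{terms}\n\n"
        f"Evidence:\n{evidence}\n\n"
        f"Figure descriptions:\n{figure_notes}\n\n"
        f"Return a JSON object with a numeric score in "
        f"[{SCORE_MIN}, {SCORE_MAX}] for each of: {keys}"
    )


def validate_output(raw):
    """Parse one model reply and reject anything outside the expected schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    expected = {f"{c}/{h}" for c in CONTEXTS for h in HYPOTHESES}
    if set(data) != expected:
        raise ValueError(f"missing or unexpected keys: {sorted(set(data) ^ expected)}")
    for key, value in data.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{key}: score must be a number")
        if not SCORE_MIN <= value <= SCORE_MAX:
            raise ValueError(f"{key}: score {value} is out of range")
    return {key: float(value) for key, value in data.items()}
```

In a council setting, each model's reply would pass through the same validator, so that malformed or off-ontology outputs are caught before any scores are aggregated.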
In practice
- Use a multi-LLM council to measure inter-model variability.
- Define a precise glossary for LLM-assisted literature review.
- Map literature into a quantitative hypothesis space.
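One simple way to measure inter-model variability is to summarize each study's council scores by their mean and spread. The sketch below assumes one validated score per model for a given study, hypothesis, and context; the model names, the example values, and the choice of population standard deviation as the spread measure are illustrative assumptions, not details from the paper.

```python
from statistics import fmean, pstdev


def council_summary(scores_by_model):
    """Summarize one study's scores across the council.

    Returns the council mean (the position in hypothesis space) and the
    population standard deviation (the inter-model disagreement).
    """
    values = list(scores_by_model.values())
    if not values:
        raise ValueError("need at least one model score")
    return {
        "mean": fmean(values),
        "spread": pstdev(values),
        "n_models": len(values),
    }


# Hypothetical scores for one study on one hypothesis in one context.
example = {"model_a": 0.2, "model_b": 0.4, "model_c": 0.6}
summary = council_summary(example)
print(summary)
```

Sorting studies by `spread` would then surface the papers on which the council disagrees most, which are natural candidates for manual review.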
Topics
- Large Language Models
- Literature Synthesis
- Predictive Coding Neuroscience
- Ontology Engineering
- Multi-Agent Systems
- Computational Neuroscience
Best for: AI Scientist, Research Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.