I thought a different model would catch the plate error. The data said otherwise.
Summary
The common assumption that a model from a different vendor provides an independent "second opinion" for verification is largely incorrect. Research indicates that when two strong models both err, they converge on the same wrong answer approximately 60% of the time, far above the ~33% expected if their errors were truly independent. This correlation persists across diverse architectures and vendors, and is often more pronounced in highly capable models because of overlapping training data and shared priors. The article argues that instead of a "second opinion," systems need a "second signal" that fails for fundamentally different reasons. Examples include format constraints, different grounding information (such as plate state inferred from color), deterministic shape-distance analysis, and varied input resolutions, each intended to provide a genuinely orthogonal check.
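To make the correlation claim concrete, here is a minimal sketch of how one might measure it on a labeled evaluation set: among the items both models got wrong, count how often they gave the same wrong answer. The function name and the sample reads are invented for illustration, and the research cited above may define its metric differently.

```python
from typing import Optional, Sequence


def wrong_answer_agreement(
    truth: Sequence[str],
    preds_a: Sequence[str],
    preds_b: Sequence[str],
) -> Optional[float]:
    """Among items BOTH models got wrong, return the fraction where they
    produced the same wrong answer. Returns None if no such items exist."""
    both_wrong = [
        (a, b)
        for t, a, b in zip(truth, preds_a, preds_b)
        if a != t and b != t
    ]
    if not both_wrong:
        return None
    same = sum(1 for a, b in both_wrong if a == b)
    return same / len(both_wrong)


# Made-up plate reads, purely illustrative (not data from the article).
truth   = ["ABC123", "XYZ789", "JKL456", "MNO321", "QRS654"]
model_a = ["A8C123", "XYZ789", "JKL4S6", "MN0321", "QRS654"]
model_b = ["A8C123", "XYZ789", "JKL4S6", "MNO32I", "QRS654"]

rate = wrong_answer_agreement(truth, model_a, model_b)
print(rate)  # both wrong on 3 items, same wrong answer on 2 of them
```

A rate well above what independent errors would produce suggests the second model is adding little verification value on exactly the items where it matters.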
Key takeaway
For AI engineers designing robust verification pipelines, relying on a second model from a different vendor for error checking is insufficient. Instead, implement genuinely orthogonal "second signals" that fail for different reasons, such as format constraints or external contextual data. When these independent signals still disagree, especially where misreads carry high costs, the system must abstain and flag the item for human review rather than laundering a guess into a confirmed answer.
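The abstain rule can be sketched as a small gate. This is one possible policy under stated assumptions, not the article's implementation: each orthogonal check reports `True` (consistent), `False` (contradicts), or `None` (no opinion), and the names `Decision`, `decide`, and the check labels are hypothetical.

```python
from typing import Dict, NamedTuple, Optional


class Decision(NamedTuple):
    value: Optional[str]   # accepted read, or None when abstaining
    needs_review: bool     # True means route the item to a human
    reason: str


def decide(candidate: str, checks: Dict[str, Optional[bool]]) -> Decision:
    """Accept the candidate only if no check contradicts it and at least
    one check positively confirms it; otherwise abstain."""
    failed = [name for name, ok in checks.items() if ok is False]
    if failed:
        return Decision(None, True, "contradicted by: " + ", ".join(failed))
    if not any(ok is True for ok in checks.values()):
        return Decision(None, True, "no independent signal confirmed the read")
    return Decision(candidate, False, "confirmed")


# Hypothetical check results for a single plate read.
print(decide("ABC123", {"format": True, "state_color": False}))
print(decide("ABC123", {"format": True, "state_color": None}))
```

Requiring at least one positive confirmation is a deliberately conservative choice; a pipeline where misreads are cheap might accept reads that merely go uncontradicted.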
Key insights
Model errors are highly correlated, even across vendors, requiring orthogonal "second signals" rather than "second opinions" for robust verification.
Principles
- Different models often share biases due to common training data.
- Verification requires genuinely independent failure mechanisms.
- More capable models can exhibit higher error correlation.
In practice
- Implement format or checksum constraints on model outputs.
- Integrate different grounding data, such as plate state inferred from color.
- Use deterministic shape-distance analysis or varied input resolution.
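Two of the checks above can be sketched without any model at all. The plate pattern (three letters followed by three digits) and the look-alike table are assumptions made for illustration, and the look-alike lookup is only a crude stand-in for the shape-distance analysis the article mentions, whose actual method is not described here.

```python
import re

# Hypothetical plate format: three uppercase letters, then three digits.
PLATE_PATTERN = re.compile(r"[A-Z]{3}[0-9]{3}")

# Assumed look-alike character pairs; a real table would be derived from
# glyph shapes for the specific plate font.
CONFUSABLE = {
    frozenset(pair)
    for pair in [("0", "O"), ("1", "I"), ("5", "S"), ("8", "B"), ("2", "Z")]
}


def passes_format(read: str) -> bool:
    """Deterministic format check: True only if the whole string matches."""
    return PLATE_PATTERN.fullmatch(read) is not None


def confusable_positions(read_a: str, read_b: str) -> list:
    """Positions where two equal-length reads differ by a look-alike pair.
    Returns an empty list when the lengths differ."""
    if len(read_a) != len(read_b):
        return []
    return [
        i
        for i, (a, b) in enumerate(zip(read_a, read_b))
        if a != b and frozenset((a, b)) in CONFUSABLE
    ]


print(passes_format("ABC123"))                   # matches the assumed format
print(passes_format("A8C123"))                   # digit in a letter slot
print(confusable_positions("MNO321", "MN0321"))  # O vs 0 at index 2
```

These checks fail for different reasons than a vision model does: a regex cannot share training data with an OCR system, which is the property the "second signal" idea depends on.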
Topics
- Correlated Model Errors
- AI Model Verification
- Orthogonal Signals
- Optical Character Recognition
- MLOps Pipelines
- Uncertainty Quantification
Best for: AI Engineer, Machine Learning Engineer, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AI Advances - Medium.