Who decides what AI tells you? Campbell Brown, once Meta’s news chief, has thoughts

· Source: AI News & Artificial Intelligence | TechCrunch · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation, AI Ethics & Governance · Depth: Intermediate, short

Summary

Campbell Brown, former Facebook news chief, founded Forum AI 17 months ago to address accuracy and bias problems in foundation models, particularly on "high-stakes topics" such as geopolitics, mental health, and finance. Forum AI builds its benchmarks by recruiting world-renowned experts, including Niall Ferguson and former Secretary of State Tony Blinken, to evaluate AI model performance. The company aims for its AI judges to reach approximately 90% consensus with human experts, a threshold it claims to have met. Initial evaluations revealed issues such as Gemini pulling from Chinese Communist Party websites and a left-leaning political bias across most models. Brown emphasizes that foundation model companies prioritize coding and math, yet the accuracy of information is crucial, especially given AI's potential to become the primary information funnel.

Key takeaway

CTOs and VPs of Engineering evaluating AI solutions for enterprise applications should prioritize expert-driven evaluation and compliance frameworks that go beyond basic checkbox audits. The current compliance landscape, described as a "joke" and exemplified by NYC's hiring bias law, suggests that smart generalists and standard benchmarks are insufficient. Insist on domain-specific expertise to identify and mitigate subtle failures and edge cases, so that your AI systems meet stringent accuracy and liability requirements.

Key insights

AI models exhibit significant accuracy and bias issues on high-stakes topics, necessitating expert-driven evaluation.

Method

Forum AI recruits domain experts to architect benchmarks, then trains AI judges to evaluate foundation models at scale, aiming for 90% consensus with human expert judgments on nuanced topics.
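
The consensus target above can be made concrete with a small calculation. The following is a minimal sketch, not Forum AI's actual method: it assumes each benchmark item carries one AI-judge label and a panel of expert labels, takes the expert majority as the reference, and skips items where the experts are tied. All function names, the label values, and the tie-handling rule are hypothetical choices made for illustration; only the approximate 90% target comes from the summary.

```python
from collections import Counter

# Approximate target mentioned in the summary; everything else is assumed.
CONSENSUS_TARGET = 0.90


def expert_majority(panel_labels):
    """Return the most common expert label, or None if empty or tied."""
    counts = Counter(panel_labels).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear majority among the experts
    return counts[0][0]


def consensus_rate(judge_labels, expert_panels):
    """Fraction of items where the AI judge matches the expert majority.

    Items without a clear expert majority are excluded from the score.
    """
    if len(judge_labels) != len(expert_panels):
        raise ValueError("judge_labels and expert_panels must align")
    matched = 0
    scored = 0
    for judge_label, panel in zip(judge_labels, expert_panels):
        majority = expert_majority(panel)
        if majority is None:
            continue
        scored += 1
        if judge_label == majority:
            matched += 1
    if scored == 0:
        raise ValueError("no items with a clear expert majority")
    return matched / scored


def meets_target(rate, target=CONSENSUS_TARGET):
    """True when the agreement rate reaches the consensus target."""
    return rate >= target


if __name__ == "__main__":
    # Hypothetical labels for four benchmark items.
    judge = ["accurate", "biased", "accurate", "accurate"]
    panels = [
        ["accurate", "accurate", "biased"],
        ["biased", "biased", "biased"],
        ["biased", "biased", "accurate"],
        ["accurate", "biased"],  # tie, excluded from the score
    ]
    rate = consensus_rate(judge, panels)
    print(f"consensus: {rate:.2f}, meets target: {meets_target(rate)}")
```

Raw agreement is the simplest possible measure. A real evaluation of nuanced topics would likely also need a chance-corrected statistic and a deliberate policy for items on which the experts themselves disagree.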

Best for: CTO, VP of Engineering/Data, Executive, AI Ethicist, Policy Maker, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by AI News & Artificial Intelligence | TechCrunch.