RiskNet: A large-scale dataset of AI risk incidents from news with alignment and multi-dimensional annotations
Summary
RiskNet is a new, large-scale dataset designed to track and analyze real-world AI risk incidents, addressing the current scarcity of empirical resources. Constructed from hundreds of millions of multilingual news sources, RiskNet employs a structured pipeline for identifying AI risk news, screening event-level reports, aligning incidents, and performing multi-dimensional classification. This resource organizes dispersed news reports into incident-centered records, offering benchmark datasets for event classification, incident alignment, and incident-level risk labeling. The current release includes aligned incident clusters and annotated benchmark subsets, accessible via an online platform. RiskNet aims to support research in AI safety, governance, risk analysis, and benchmarking, providing a structured resource to bridge the gap between high-level AI governance principles and documented real-world harms.
Key takeaway
For AI scientists and governance researchers developing or evaluating responsible AI frameworks, RiskNet provides an empirical foundation. Its incident clusters and annotated benchmark subsets can be used to validate risk classification and incident alignment models and to inform policy, so that the work reflects documented real-world AI harms rather than high-level principles alone.
Key insights
A large-scale, structured dataset of real-world AI risk incidents is crucial for empirical AI safety and governance research.
Principles
- AI risk tracking needs empirical data.
- News sources offer rich incident data.
- Structured data bridges governance gaps.
Method
RiskNet builds incident records from multilingual news through a four-stage pipeline: AI risk news identification, event-level report screening, incident alignment, and multi-dimensional incident classification.
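To make the four stages concrete, here is a minimal Python sketch of how such a pipeline could be wired together. It is an illustration only, not RiskNet's implementation: the summary does not describe the models or label schema used, so the keyword sets, the Jaccard-overlap alignment, the `harm_type` and `technology` dimensions, and every function and class name below are hypothetical stand-ins.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword lists standing in for trained classifiers.
AI_TERMS = {"ai", "chatbot", "algorithm", "deepfake"}
RISK_TERMS = {"harm", "lawsuit", "bias", "scam", "fraud"}
OPINION_TERMS = {"opinion", "editorial", "explainer"}

# Hypothetical label dimensions; the real annotation schema is not given in the summary.
DIMENSIONS = {
    "harm_type": {"financial": {"scam", "fraud"}, "discrimination": {"bias"}},
    "technology": {"generative": {"deepfake", "chatbot"}},
}


@dataclass
class Report:
    report_id: str
    text: str

    @property
    def tokens(self):
        # Lowercased alphabetic tokens; a crude substitute for multilingual processing.
        return set(re.findall(r"[a-z]+", self.text.lower()))


@dataclass
class Incident:
    reports: list
    labels: dict = field(default_factory=dict)


def is_ai_risk_news(report):
    """Stage 1: keep reports that mention both an AI term and a risk term."""
    return bool(report.tokens & AI_TERMS) and bool(report.tokens & RISK_TERMS)


def is_event_report(report):
    """Stage 2: drop commentary so only event-level reports remain."""
    return not (report.tokens & OPINION_TERMS)


def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def align(reports, threshold=0.5):
    """Stage 3: greedily group reports whose tokens overlap with a cluster's first report."""
    incidents = []
    for report in reports:
        for incident in incidents:
            if jaccard(report.tokens, incident.reports[0].tokens) >= threshold:
                incident.reports.append(report)
                break
        else:
            incidents.append(Incident(reports=[report]))
    return incidents


def classify(incident):
    """Stage 4: assign labels along each dimension from the pooled report tokens."""
    tokens = set().union(*(r.tokens for r in incident.reports))
    for dimension, classes in DIMENSIONS.items():
        incident.labels[dimension] = sorted(
            name for name, keywords in classes.items() if tokens & keywords
        )
    return incident


def run_pipeline(reports):
    candidates = [r for r in reports if is_ai_risk_news(r)]
    events = [r for r in candidates if is_event_report(r)]
    return [classify(incident) for incident in align(events)]


reports = [
    Report("a", "Deepfake scam targets bank customers"),
    Report("b", "Bank customers hit by deepfake scam"),
    Report("c", "Opinion: AI bias is everywhere"),
    Report("d", "Local team wins football match"),
]
incidents = run_pipeline(reports)
print(len(incidents), incidents[0].labels)
# 1 {'harm_type': ['financial'], 'technology': ['generative']}
```

In this toy run, reports "a" and "b" are merged into one incident record, "c" is screened out as commentary, and "d" never passes the first stage. A real system would replace each keyword check with a trained multilingual model, but the shape of the data flow (many reports in, fewer labeled incident records out) is the same.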
In practice
- Use RiskNet for AI safety research.
- Explore incidents via the online platform.
- Benchmark AI risk classification models.
Topics
- AI Risk Incidents
- AI Safety
- AI Governance
- RiskNet Dataset
- Machine Learning
- Dataset Annotation
Best for: Research Scientist, AI Scientist, AI Ethicist, Policy Maker
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.