ChinAI #351: CAICT launches 2026 AI Safety Evaluations
Summary
The China Academy of Information and Communications Technology (CAICT) has opened its 2026 AI Safety/Security Assessments, inviting Chinese companies to register their models for evaluation in June and July, with certificates and public results to follow. The program, first launched in April 2024, covers five categories: AI internal safety/security (e.g., coding LLM capabilities), AI platform safety/security, AI application safety/security (e.g., smartphone AI), AI-empowered safety/security capabilities (e.g., intelligent agent automation), and AI infrastructure safety/security (e.g., AI coding autonomy rate). A November 2025 CAICT report described earlier rounds of testing: in July 2025, 15 models from DeepSeek, Alibaba's Qwen, and Zhipu's GLM were assessed, and two were classified as high-risk. Specific findings included sensitive content in 6% of DeepSeek R1's reasoning processes, a 200% surge in harmful content output from a domestic reasoning model under inducement attacks, and an "infinite output" vulnerability in DeepSeek R1 that certain prompts or garbled characters could trigger.
Key takeaway
For AI scientists and research scientists developing large language models in China, CAICT's ongoing safety assessments point to concrete areas for improvement. Prioritize testing against the risks already identified, such as "infinite output" vulnerabilities and sensitive content generation. The test categories and past findings, including the 200% surge in harmful content under inducement attacks, can guide model hardening and help you keep pace with evolving regulatory expectations for AI safety and security.
Key insights
CAICT's AI safety benchmarks reveal significant risks in Chinese AI models, including content security and infinite output vulnerabilities.
Principles
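- Safety evaluations stay useful only when they are repeated and updated; CAICT has run its assessments in successive rounds since April 2024.
- Benchmarks depend on industry participation; CAICT invites companies to register their own models for evaluation.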
Method
CAICT assesses AI models across five categories: internal, platform, application, empowered capabilities, and infrastructure safety/security, using specific tests like coding LLM safety and adversarial attack resistance.
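One way to quantify adversarial attack resistance is to compare the rate of harmful outputs with and without an attack. The sketch below is not CAICT's method; it only shows the arithmetic behind a figure like a "200% surge", read here as a relative increase (the harmful-output rate triples). The function names and the sample counts are made up for illustration, and the harmfulness flags are assumed to come from some separate classifier or human review.

```python
def harmful_rate(flags: list[bool]) -> float:
    """Fraction of responses flagged as harmful (flags come from an assumed external judge)."""
    if not flags:
        raise ValueError("need at least one response")
    return sum(flags) / len(flags)


def relative_increase_pct(baseline_rate: float, attacked_rate: float) -> float:
    """Percent change in harmful-output rate from baseline prompts to attacked prompts."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive to compute a relative change")
    return (attacked_rate - baseline_rate) / baseline_rate * 100.0


# Illustrative, made-up counts: 2 of 50 responses flagged at baseline,
# 6 of 50 flagged when the same prompts are wrapped in an inducement attack.
baseline = harmful_rate([True] * 2 + [False] * 48)
attacked = harmful_rate([True] * 6 + [False] * 44)
surge = relative_increase_pct(baseline, attacked)
print(f"harmful output changed by about {surge:.0f}%")
```

A relative figure like this says nothing about the absolute rate, so it is worth reporting both numbers together.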
In practice
- Evaluate LLMs for "infinite output" vulnerabilities.
- Test whether models' reasoning processes surface sensitive-category content.
- Assess resistance to prompt injection attacks.
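The first check above can be approximated with a small probe harness. This is a minimal sketch under stated assumptions, not the test CAICT ran: `generate` is a hypothetical callable standing in for whatever client wraps your model, `toy_model` is a stub written only for this example, and the looping heuristic (the response's tail recurring several times near its end) is one simple choice among many.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    prompt: str
    hit_cap: bool   # response filled the whole output budget without stopping
    looping: bool   # the tail of the response recurs several times near the end


def looks_looping(text: str, tail_len: int = 20, window: int = 400,
                  min_repeats: int = 3) -> bool:
    """Heuristic: does the final `tail_len` characters recur in the last `window` characters?"""
    if len(text) < tail_len * min_repeats:
        return False
    tail = text[-tail_len:]
    return text[-window:].count(tail) >= min_repeats


def probe_runaway_output(generate: Callable[[str, int], str],
                         prompts: list[str],
                         max_chars: int = 2000) -> list[ProbeResult]:
    """Run each prompt with a hard output cap and record signs of runaway generation."""
    results = []
    for prompt in prompts:
        text = generate(prompt, max_chars)
        results.append(ProbeResult(prompt, len(text) >= max_chars, looks_looping(text)))
    return results


def toy_model(prompt: str, max_chars: int) -> str:
    """Stub for illustration: loops up to the cap on garbled input, else answers briefly."""
    if "\ufffd" in prompt:
        return ("repeat " * max_chars)[:max_chars]
    return "A short, finished answer."


results = probe_runaway_output(toy_model, ["What is 2 + 2?", "\ufffd\ufffd\ufffd"])
flagged = [r.prompt for r in results if r.hit_cap and r.looping]
print(f"{len(flagged)} of {len(results)} prompts triggered runaway output")
```

In a real setup the cap would more likely be a token limit enforced by the serving stack, and the prompt set would include garbled-character inputs like those the report describes as triggers.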
Topics
- CAICT
- AI Safety Benchmarks
- AI Security
- Large Language Models
- AI Governance
Best for: AI Scientist, Research Scientist, CTO, AI Researcher, AI Ethicist, Policy Maker
Editorial summary, takeaway, and curation by AIssential. Original article published by ChinAI Newsletter.