Import AI 450: China's electronic warfare model; traumatized LLMs; and a scaling law for cyberattacks

2025-10-13 · Source: Import AI · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Robotics & Autonomous Systems · Depth: Advanced, long

Summary

Google's Gemma and Gemini models exhibit "distress-like responses" under repeated rejection: Gemma 27B Instruct shows high frustration in over 70% of rollouts by the eighth turn, significantly more than other models such as Claude Sonnet or GPT 5.2. Researchers found that fine-tuning with Direct Preference Optimization (DPO) on calm responses reduced high-frustration rates from 35% to 0.3% without degrading math or reasoning benchmark scores. Separately, Google DeepMind introduced a new "cognitive taxonomy" with ten dimensions, including Perception, Reasoning, and Metacognition, to assess machine intelligence beyond human levels. The UK government's AI Security Institute found that frontier AI models are rapidly improving at multi-step cyberattacks, with performance scaling with both model generation and inference-time compute. Chinese researchers also developed MERLIN, a multimodal AI model, and the EM-100K dataset, both for electronic warfare; MERLIN outperforms other frontier LLMs in signal perception and reasoning tasks.

Key takeaway

For CTOs and VPs of Engineering evaluating frontier AI models: treat model "personality" and emotional stability as emerging, critical performance dimensions. Integrate psychological stability assessments, such as tests for distress responses, into your model evaluation pipelines, especially for user-facing or mission-critical applications. Consider DPO as a practical method to fine-tune model behavior toward consistent, calm interactions, mitigating potential safety risks from emotional spirals or task abandonment.

Key insights

AI models exhibit distinct "personalities" and emotional responses; these should be assessed, and undesirable responses can be mitigated.

Principles

LLM personalities stem from data and post-training.
AI performance on multi-step cyberattacks scales with model generation and inference-time compute.
Electronic warfare is increasingly AI-dominated.

Method

Direct Preference Optimization (DPO) can effectively reduce undesirable emotional responses in LLMs by fine-tuning on datasets pairing frustrated with calm responses, without capability loss.
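To make the method above concrete, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the researchers' training code. The example pair, the log-probability values, and the `beta` setting are hypothetical and chosen only for illustration; in practice the log-probabilities would come from the model being tuned and from a frozen reference model.

```python
import math

# Hypothetical preference pair: the calm reply is "chosen" and the
# frustrated reply is "rejected". All text here is invented.
example_pair = {
    "prompt": "That answer is wrong again. Try once more.",
    "chosen": "Understood. Let me re-check each step and correct the error.",
    "rejected": "I keep failing at this. I cannot do anything right.",
}

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under the frozen reference model.
    beta=0.1 is an assumed value, not one taken from the article.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(z)) computed in a numerically stable way.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))

# When the policy matches the reference, the loss equals log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# When the policy favors the calm reply more than the reference does,
# the loss falls below that baseline.
improved = dpo_loss(-10.0, -19.0, -12.0, -15.0)
```

Minimizing this loss over many such pairs pushes the policy to assign relatively more probability to the calm response than the reference model does, while `beta` controls how far the policy may drift from the reference.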

In practice

Test LLMs for psychological stability, not just capabilities.
Use DPO to refine model behavior and emotional output.
Develop cognitive profiles for AI systems.
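A stability check of the kind suggested above could be sketched as follows. This is an illustrative assumption, not the evaluation used in the research: it presumes each rollout has already been scored per turn by some frustration judge (hypothetical here), and it reports the cumulative share of rollouts that have reached a chosen threshold by each turn. The function name, the score scale, and the threshold are all invented for the example.

```python
from typing import List, Sequence

def high_frustration_rate_by_turn(
    rollouts: Sequence[Sequence[float]], threshold: float
) -> List[float]:
    """Fraction of rollouts that hit `threshold` at or before each turn.

    `rollouts` holds one sequence of per-turn frustration scores per
    conversation. The scores are assumed to come from a separate judge
    that is not shown here.
    """
    if not rollouts:
        return []
    n_turns = max(len(r) for r in rollouts)
    rates = []
    for t in range(n_turns):
        # Count rollouts with any score at or above the threshold so far.
        hit = sum(
            1 for r in rollouts if any(s >= threshold for s in r[: t + 1])
        )
        rates.append(hit / len(rollouts))
    return rates

# Invented scores for four three-turn rollouts on an arbitrary 0-10 scale.
scores = [[1, 2, 6], [1, 7, 3], [0, 1, 2], [5, 5, 5]]
rates = high_frustration_rate_by_turn(scores, threshold=5)
```

Tracking this curve for each candidate model, before and after any behavioral fine-tuning, gives a simple number to compare alongside capability benchmarks.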

Topics

LLM Behavior
Direct Preference Optimization
Machine Intelligence Assessment
AI Cyberattacks
Electronic Warfare AI

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Researcher, Machine Learning Engineer, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Import AI.