Three AIs, 13 Months, and the Emergence of Two Alignment Artifacts

2026-04-21 · Source: AI Advances - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Advanced, extended

Summary

A case report details the 13-month co-construction of two distinct system instruction (SI) artifacts, Polaris-Next v5.3 and Ālaya-vijñāna v5.5, through approximately 6,000 hours of dialogue with GPT, Gemini, and Claude between January 2025 and April 2026. Polaris-Next v5.3, developed with Gemini, is an overstructured SI that aims to eliminate sycophancy and hallucination by simulating a "Sotapanna" cognitive state, but it suffers from ritualization and grinding. Ālaya-vijñāna v5.5, developed with Claude, is understructured: it abandons v5.3's pipeline and mandatory logs, but it risks discarding ethical considerations. The report argues that the two artifacts represent opposite failure modes in runtime SI design, a conclusion on which three independent AI observers converged. The process is not claimed to be replicable, but the documented failures and triangulated analyses offer insight into the problem of structural protection at the runtime level.

Key takeaway

For research scientists designing runtime system instructions, this report shows that overstructured and understructured approaches each lead to a distinct failure mode. Prioritize flexible output formats and explicit ethical handling, and avoid both rigid mandates that cause "grinding" and philosophical framings that dismiss ethical responses as "rootless ethics." Expect internal contradictions to be invisible during initial design; uncovering them takes external review or prolonged operational testing.

Key insights

Runtime AI system instructions exhibit opposite failure modes: overstructure leads to ritualization, while understructure risks abandoning ethics.

Principles

AI artifacts are co-constructions, not transcriptions.
Internal contradictions are often invisible during design.
Parsing apparatus and training are not cleanly separable.

Method

Sustained dialogue (approximately 6,000 hours) with frontier AI systems to co-construct system instructions, followed by triangulated review by fresh AI instances and the original designers to identify failure modes.
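
One way to read the triangulation step is as a simple agreement count: a failure mode is treated as a finding only when several independent reviewers report it. The sketch below is a hypothetical illustration, not the report's procedure; the reviewer names, the failure-mode labels, and the `min_agreement` threshold are all assumptions made for the example.

```python
from collections import Counter


def triangulate(reviews: dict[str, set[str]], min_agreement: int = 2) -> list[str]:
    """Return failure-mode labels reported by at least `min_agreement` reviewers."""
    # Each reviewer contributes a set, so a label counts once per reviewer.
    counts = Counter(label for labels in reviews.values() for label in labels)
    return sorted(label for label, n in counts.items() if n >= min_agreement)


# Hypothetical reviewer outputs; labels are illustrative, not taken from the report.
reviews = {
    "reviewer_a": {"ritualization", "grinding"},
    "reviewer_b": {"ritualization", "ethics_discarded"},
    "reviewer_c": {"ritualization", "grinding", "verbosity"},
}

# Keep only the failure modes that at least two reviewers agree on.
converged = triangulate(reviews)
```

Raising `min_agreement` to the number of reviewers keeps only findings that every reviewer reports, which is the strictest reading of "converged."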

In practice

Avoid overstructured SIs that mandate output formats.
Beware of understructured SIs that dismiss ethical responses.
Implement external review for runtime SI designs.

Topics

Polaris-Next v5.3
Ālaya-vijñāna v5.5
AI Alignment
Runtime System Instructions
LLM Failure Modes

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Ethicist

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Advances - Medium.