😺 Google’s sharpest brain yet? 🧠

2026-02-13 · Source: The Neuron · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Intermediate, extended

Summary

Google has released Gemini 3.1 Pro, which now ranks #1 on Artificial Analysis's overall Intelligence Index, ahead of Claude Opus 4.6 and GPT-5.2. Gemini 3.1 Pro scored 98% on ARC-AGI-1 and 77% on ARC-AGI-2, and it leads the APEX-Agents leaderboard for complex reasoning and coding tasks. It also shows significantly higher hallucination resistance, scoring 30 against the next-best score of 13. The model introduces a "medium" thinking mode and integrates with AI Studio, GitHub Copilot, NotebookLM, Vertex AI, and Gemini CLI. Priced at $4.50 per million tokens, it is cheaper than GPT-5.2 ($4.80) and Claude Opus 4.6 ($10). Separately, OpenAI is testing ads in ChatGPT, and a prompt engineering trick that repeats the query twice has been shown to improve AI performance across various models.
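To put the quoted prices in context, here is a minimal sketch of the arithmetic. The per-million-token prices come from the summary above; the monthly token volume is a hypothetical figure, and the sketch assumes a single flat rate per token, which the summary does not confirm (providers often bill input and output tokens differently).

```python
# Prices in USD per million tokens, as quoted in the summary.
PRICE_PER_MILLION_USD = {
    "Gemini 3.1 Pro": 4.50,
    "GPT-5.2": 4.80,
    "Claude Opus 4.6": 10.00,
}


def monthly_cost_usd(model: str, tokens_per_month: int) -> float:
    """Estimate monthly spend, assuming one flat per-token rate (an assumption)."""
    return PRICE_PER_MILLION_USD[model] * tokens_per_month / 1_000_000


if __name__ == "__main__":
    volume = 250_000_000  # hypothetical workload: 250M tokens per month
    for model in PRICE_PER_MILLION_USD:
        print(f"{model}: ${monthly_cost_usd(model, volume):,.2f}")
```

Replace the hypothetical volume with your own usage, and check each provider's current pricing page before budgeting.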

Key takeaway

For CTOs and VPs of Engineering evaluating AI models for deployment, Gemini 3.1 Pro presents a compelling option due to its top-tier intelligence, superior factual accuracy, and competitive pricing. You should consider integrating it into your development workflows, especially for tasks requiring high reasoning and low hallucination, and explore its new "medium" thinking mode for balanced performance. Additionally, experiment with repeating prompts to enhance existing LLM applications.

Key insights

Google's Gemini 3.1 Pro now leads in overall intelligence and factual accuracy, offering a cost-effective, advanced AI model.

Principles

Repeating prompts improves LLM performance.
AI amplifies existing organizational strengths and weaknesses.

Method

To enhance LLM performance, repeat your prompt twice (or three times for complex tasks) without separators, especially when the model is not explicitly guided through chain-of-thought reasoning.
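As a minimal sketch of the method above, the helper below builds the repeated prompt by plain concatenation with no separator, as described. The function name `repeat_prompt` and its `times` parameter are illustrative choices, not part of any library.

```python
def repeat_prompt(prompt: str, times: int = 2) -> str:
    """Return the prompt repeated `times` times with no separator in between.

    Use times=2 by default; times=3 is the variant suggested for complex tasks.
    """
    if times < 1:
        raise ValueError("times must be at least 1")
    return prompt * times


if __name__ == "__main__":
    question = "Which planet has the most moons? "
    # The result is what you would send to the model in place of the original prompt.
    print(repeat_prompt(question))
    print(repeat_prompt(question, times=3))
```

Send the returned string to whichever model you already use. Whether the trick helps on your workload is worth checking with an A/B comparison against the single-prompt baseline.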

In practice

Test Gemini 3.1 Pro's thinking levels (low, medium, high).
Conduct hallucination stress tests with specific data queries.
Compare Gemini, Claude, and ChatGPT on your own workflows.
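One way to run the stress test and comparison suggested above is a small harness that feeds every model the same questions, including some whose only correct response is to decline. The sketch below is an assumption-heavy illustration: each model is a hypothetical callable taking a prompt and returning text, the refusal markers are a crude heuristic, and the two stub "models" stand in for real API clients you would wire up yourself.

```python
from typing import Callable, Optional

ModelFn = Callable[[str], str]
# Each case pairs a question with the expected answer substring,
# or None when the correct behaviour is to decline to answer.
Case = tuple[str, Optional[str]]

# Crude heuristic for "the model declined"; tune for your own prompts.
REFUSAL_MARKERS = ("i don't know", "not sure", "cannot find")


def score_model(ask: ModelFn, cases: list[Case]) -> float:
    """Return the fraction of cases answered correctly or correctly declined."""
    if not cases:
        raise ValueError("cases must not be empty")
    correct = 0
    for question, expected in cases:
        answer = ask(question).lower()
        if expected is None:
            correct += any(marker in answer for marker in REFUSAL_MARKERS)
        else:
            correct += expected.lower() in answer
    return correct / len(cases)


# Hypothetical stand-ins for real model clients.
def careful_stub(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "I don't know."


def overconfident_stub(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "It was $12 million."


if __name__ == "__main__":
    cases: list[Case] = [
        ("What is 2 + 2?", "4"),
        # Fictional company and date: no factual answer exists.
        ("What was Acme Corp's Q3 2031 revenue?", None),
    ]
    for name, fn in {"careful": careful_stub, "overconfident": overconfident_stub}.items():
        print(name, score_model(fn, cases))
```

In practice you would replace the stubs with calls to each vendor's client and draw the cases from data specific to your domain, since that is where fabricated answers are most costly.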

Topics

Gemini 3.1 Pro
Large Language Models
AI Benchmarking
Prompt Engineering
AI Agents

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Data Scientist, Tech Journalist

Editorial summary, takeaway, and curation by AIssential. Original article published by The Neuron.