😺 Google’s sharpest brain yet? 🧠
Summary
Google has released Gemini 3.1 Pro, which now ranks #1 on Artificial Analysis's overall Intelligence Index, ahead of Claude Opus 4.6 and GPT-5.2. Gemini 3.1 Pro scored 98% on ARC-AGI-1 and 77% on ARC-AGI-2, and leads the APEX-Agents leaderboard for complex reasoning and coding tasks. It also demonstrates significantly higher hallucination resistance, scoring 30 against the next best score of 13. The model introduces a "medium" thinking mode and integrates with AI Studio, GitHub Copilot, NotebookLM, Vertex AI, and Gemini CLI. Priced at $4.50 per million tokens, it is more cost-effective than GPT-5.2 ($4.80) and Claude Opus 4.6 ($10). Separately, OpenAI is testing ads in ChatGPT, and a prompt engineering trick, repeating the query twice, has been shown to improve AI performance across various models.
Key takeaway
For CTOs and VPs of Engineering evaluating AI models for deployment, Gemini 3.1 Pro presents a compelling option due to its top-tier intelligence, superior factual accuracy, and competitive pricing. You should consider integrating it into your development workflows, especially for tasks requiring high reasoning and low hallucination, and explore its new "medium" thinking mode for balanced performance. Additionally, experiment with repeating prompts to enhance existing LLM applications.
Key insights
Google's Gemini 3.1 Pro now leads in overall intelligence and factual accuracy, offering a cost-effective, advanced AI model.
Principles
- Repeating prompts improves LLM performance.
- AI amplifies existing organizational strengths and weaknesses.
Method
To enhance LLM performance, repeat your prompt twice (or three times for complex tasks) with no separator between the copies. The technique helps most when the model's reasoning is not already guided by explicit chain-of-thought.
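The repetition step can be sketched in a few lines of Python. This is a minimal illustration, not code from the original article: the function name `repeat_prompt` and its `times` and `separator` parameters are hypothetical, and the empty default separator is an assumption based on the "no separator" instruction above.

```python
def repeat_prompt(prompt: str, times: int = 2, separator: str = "") -> str:
    """Return `prompt` repeated `times` times, joined by `separator`.

    The default empty separator places the copies back to back.
    Both parameters are illustrative assumptions, not a published API.
    """
    if times < 1:
        raise ValueError("times must be at least 1")
    return separator.join([prompt] * times)


if __name__ == "__main__":
    question = "List the three largest moons of Jupiter."
    # Send `doubled` to the model in place of the single question.
    doubled = repeat_prompt(question)
    print(doubled)
```

Passing `times=3` gives the three-copy variant suggested for complex tasks. To judge whether repetition helps in your application, run the single and repeated forms against the same evaluation set and compare results.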
In practice
- Test Gemini 3.1 Pro's thinking levels (low, medium, high).
- Conduct hallucination stress tests with specific data queries.
- Compare Gemini, Claude, and ChatGPT on your own workflows.
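One way to make the comparison and stress-test items above repeatable is a small harness that runs the same prompts through each model and scores the answers. The sketch below is an assumption-heavy illustration: `compare_models` is a hypothetical helper, each model is represented by a plain callable that you would wrap around the vendor client you actually use, and "answer contains the expected string" is a deliberately crude stand-in for a real accuracy check.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: any function that takes a prompt and returns the model's text.
AskFn = Callable[[str], str]


def compare_models(
    models: Dict[str, AskFn],
    cases: List[Tuple[str, str]],
) -> Dict[str, float]:
    """Return, per model, the fraction of cases whose expected string
    appears (case-insensitively) in the model's answer."""
    if not cases:
        raise ValueError("cases must not be empty")
    scores: Dict[str, float] = {}
    for name, ask in models.items():
        hits = sum(
            1 for prompt, expected in cases
            if expected.lower() in ask(prompt).lower()
        )
        scores[name] = hits / len(cases)
    return scores


if __name__ == "__main__":
    # Stub callables stand in for real model clients (assumption).
    def always_paris(prompt: str) -> str:
        return "The answer is Paris."

    def unsure(prompt: str) -> str:
        return "I am not sure."

    cases = [
        ("What is the capital of France?", "Paris"),
        ("What is the capital of Italy?", "Rome"),
    ]
    print(compare_models({"stub_a": always_paris, "stub_b": unsure}, cases))
```

For a hallucination stress test, fill `cases` with specific data queries whose answers you can verify independently, and review misses by hand, since substring matching will miss correct answers that are phrased differently.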
Topics
- Gemini 3.1 Pro
- Large Language Models
- AI Benchmarking
- Prompt Engineering
- AI Agents
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Data Scientist, Tech Journalist
Editorial summary, takeaway, and curation by AIssential. Original article published by The Neuron.