Tinker: General Availability and Vision Input

2025-12-12 · Source: Thinking Machines Lab - Connectionism · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Computer Vision · Depth: Intermediate, quick

Summary

Thinking Machines announced four updates to its Tinker platform on December 12, 2025. First, Tinker is now generally available, with no waitlist. Second, users can fine-tune Kimi K2 Thinking, a trillion-parameter model built for complex reasoning and tool use. Third, a new OpenAI API-compatible inference interface allows plug-and-play integration with compatible platforms and lets users sample from a model while it is still training. Fourth, Tinker now accepts vision input through two models, Qwen3-VL-30B-A3B-Instruct and Qwen3-VL-235B-A22B-Instruct, which can process images, screenshots, and diagrams. A new cookbook recipe demonstrates fine-tuning VLMs as image classifiers; in it, Qwen3-VL-235B-A22B-Instruct outperforms DINOv2-base on limited-data classification across datasets such as Caltech 101 and Stanford Cars.

Key takeaway

For AI architects and computer vision engineers building multimodal applications, Tinker's Qwen3-VL support offers a practical option for image classification, especially when labeled data is scarce. The OpenAI API-compatible interface simplifies integration with existing tooling, and Kimi K2 Thinking can be fine-tuned for advanced reasoning tasks, which may shorten development cycles and improve performance on specialized datasets.

Key insights

Tinker's general availability, new reasoning model, OpenAI API compatibility, and vision input expand its utility for AI development.

Principles

VLMs can outperform vision-only baselines such as DINOv2-base in low-data image classification.
Language knowledge enhances vision task performance.

Method

Image classification can be framed as text generation: a VLM is fine-tuned with LoRA to output the class name when given an image.
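The framing above can be illustrated with a minimal Python sketch. It does not use the Tinker API or the cookbook's code; the `Example` type, the prompt wording, and the class names are all hypothetical and chosen only for illustration. It shows how a labeled image becomes a (prompt, target text) pair for fine-tuning, and how generated text might be mapped back to a class index for evaluation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Example:
    """One supervised example: image plus prompt in, class name out (hypothetical type)."""
    image_path: str
    prompt: str
    target: str


# Illustrative labels only; a real dataset supplies its own class names.
CLASS_NAMES = ["airplane", "sports car", "sunflower"]

# Assumed prompt wording; the actual recipe may phrase this differently.
PROMPT = "What is shown in this image? Answer with the class name only."


def make_example(image_path: str, label_index: int, class_names: List[str]) -> Example:
    """Turn an (image, integer label) pair into a text-generation training example."""
    return Example(image_path=image_path, prompt=PROMPT, target=class_names[label_index])


def parse_prediction(generated_text: str, class_names: List[str]) -> Optional[int]:
    """Map generated text back to a class index; return None if it matches no class."""
    normalized = generated_text.strip().lower().rstrip(".")
    for index, name in enumerate(class_names):
        if normalized == name.lower():
            return index
    return None


def accuracy(generations: List[str], gold_indices: List[int], class_names: List[str]) -> float:
    """Fraction of generations that parse to the gold class index."""
    correct = sum(
        parse_prediction(text, class_names) == gold
        for text, gold in zip(generations, gold_indices)
    )
    return correct / len(gold_indices)
```

Exact string matching is the simplest scoring rule; an unparseable generation simply counts as an error, which keeps the metric comparable to a conventional classifier's accuracy.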

In practice

Fine-tune Kimi K2 Thinking for complex reasoning.
Integrate Tinker with OpenAI API-compatible tools.
Use Qwen3-VL for image classification with limited data.
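As a rough sketch of what OpenAI API-compatible integration might look like, the Python below builds a request in the standard OpenAI completions shape using only the standard library. The base URL, API key, and model identifier are placeholders, and the assumption that the `/completions` route is the one exposed should be checked against the Tinker documentation. The request is constructed but not sent.

```python
import json
import urllib.request

# Placeholders (assumptions): substitute the endpoint, key, and model or
# checkpoint identifier documented for your own account.
BASE_URL = "https://example.invalid/v1"
API_KEY = "YOUR_API_KEY"
MODEL = "your-model-or-checkpoint-id"


def build_completion_request(base_url, api_key, model, prompt, max_tokens=32):
    """Build (but do not send) a POST request in the OpenAI completions format."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_text(response_body):
    """Pull the generated text out of an OpenAI-style completions response dict."""
    return response_body["choices"][0]["text"]


request = build_completion_request(BASE_URL, API_KEY, MODEL, "The capital of France is")
# To send it against a real endpoint:
#   with urllib.request.urlopen(request) as resp:
#       print(extract_text(json.load(resp)))
```

Because the request follows the OpenAI wire format, any client library or tool that accepts a custom base URL should be able to issue the same call without code like this.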

Topics

Tinker Platform
Kimi K2 Thinking
OpenAI API Compatibility
Vision-Language Models
Image Classification

Code references

thinking-machines-lab/tinker-cookbook

Best for: AI Architect, Computer Vision Engineer, AI Engineer, Machine Learning Engineer, AI Researcher

Editorial summary, takeaway, and curation by AIssential. Original article published by Thinking Machines Lab - Connectionism.