Qwen 3.7 Max & Plus First Test | Coding, Game Dev, OCR, Image Understanding Testing
Summary
The Qwen team has released preview versions of its Qwen 3.7 Max and Qwen 3.7 Plus large language models, which appear to be iterative improvements over Qwen 3.5 and 3.6. Qwen 3.7 Max, a text-only model, features a 262K token context window and a 65K-281K token maximum output. Qwen 3.7 Plus supports a 1 million token context window and a 65K token output, and it handles text, image, and video input. It lacks audio support, which is reserved for Omni versions. Both models are currently accessible via the Qwen chat website. In testing, Qwen 3.7 Max performed well on SVG creation and front-end design tasks, generating a "buff penguin" SVG and a functional resume website. Qwen 3.7 Plus successfully generated a recipe from a complex image of a refrigerator's contents and created an animated tutorial, though it failed to extract specific data from a PDF table.
Key takeaway
For AI engineers evaluating new large language models for deployment, Qwen 3.7 Plus, if open-sourced as anticipated, is a strong candidate because of its multimodal capabilities (text, image, video) and improved performance on creative tasks. Consider its 1 million token context window for applications requiring extensive input, but be aware of potential limitations in precise data extraction from complex documents like PDFs.
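As a rough illustration of the context-window consideration, the sketch below checks whether an input would plausibly fit each model. The window sizes are the rounded figures quoted in the summary. The model identifiers, the 4-characters-per-token heuristic, and the choice to reserve 65K tokens for output are assumptions made for illustration, not documented behavior of either model.

```python
# Context-window sizes as quoted in the summary (rounded preview figures).
# The dictionary keys are hypothetical identifiers, not official model names.
CONTEXT_WINDOW_TOKENS = {
    "qwen-3.7-max": 262_000,
    "qwen-3.7-plus": 1_000_000,
}

# Assumption: keep room for a 65K-token response inside the window.
OUTPUT_RESERVE_TOKENS = 65_000

# Assumption: about 4 characters per token. A real tokenizer will differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a crude token estimate (character count / 4, rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_context(text: str, model: str) -> bool:
    """Check whether the estimated input fits after reserving output space."""
    budget = CONTEXT_WINDOW_TOKENS[model] - OUTPUT_RESERVE_TOKENS
    return estimate_tokens(text) <= budget


if __name__ == "__main__":
    document = "x" * 2_000_000  # roughly 500K tokens under the heuristic
    for name in CONTEXT_WINDOW_TOKENS:
        print(name, fits_context(document, name))
```

For production use, the estimate should come from the model's actual tokenizer rather than a character count.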
Key insights
Qwen 3.7 Max and Plus models show improved performance and reduced 'overthinking' compared to previous iterations.
Principles
- Max models are typically closed-source.
- Plus models often support multimodal inputs.
Method
Models were evaluated using practical tasks: car wash logic, SVG generation, front-end code creation, 3D game simulation, recipe generation from image, animated tutorial creation, and PDF data extraction.
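One way to keep track of a task-based evaluation like this is a small results table. The sketch below records only the outcomes stated in the summary; the pass/fail labels are a reading of that summary ("performed well", "successfully generated", "failed to extract"), and tasks whose outcome or model is not reported there are left as `None` rather than guessed.

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class TaskResult(NamedTuple):
    task: str
    model: Optional[str]    # None: the summary does not say which model ran it
    passed: Optional[bool]  # None: the summary does not report an outcome


# Outcomes as described in the summary; unreported entries stay None.
RESULTS: List[TaskResult] = [
    TaskResult("car wash logic", None, None),
    TaskResult("SVG generation", "Qwen 3.7 Max", True),
    TaskResult("front-end code creation", "Qwen 3.7 Max", True),
    TaskResult("3D game simulation", None, None),
    TaskResult("recipe generation from image", "Qwen 3.7 Plus", True),
    TaskResult("animated tutorial creation", "Qwen 3.7 Plus", True),
    TaskResult("PDF data extraction", "Qwen 3.7 Plus", False),
]


def tally(results: List[TaskResult]) -> Counter:
    """Count passed, failed, and unreported tasks."""
    labels = {True: "passed", False: "failed", None: "unreported"}
    return Counter(labels[r.passed] for r in results)


if __name__ == "__main__":
    for r in RESULTS:
        print(f"{r.task:32} {r.model or '-':15} {r.passed}")
    print(dict(tally(RESULTS)))
```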
In practice
- Use Qwen 3.7 Max for text-only code generation.
- Use Qwen 3.7 Plus for multimodal tasks, such as generating a recipe from an image.
- Rely on Qwen 3.7 Plus for image and video processing.
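The guidance above amounts to routing by input modality, which can be sketched as a small helper. The modality sets follow the summary (Max is text-only; Plus handles text, image, and video; audio is not supported by either). The function name and the returned identifiers are hypothetical and do not correspond to a real API.

```python
from typing import Iterable

# Modalities per model, as described in the summary.
MAX_MODALITIES = {"text"}
PLUS_MODALITIES = {"text", "image", "video"}


def pick_model(modalities: Iterable[str]) -> str:
    """Return a (hypothetical) model identifier for the given input types."""
    needed = set(modalities)
    if not needed:
        raise ValueError("at least one modality is required")
    if needed <= MAX_MODALITIES:
        return "qwen-3.7-max"   # text-only work, e.g. code generation
    if needed <= PLUS_MODALITIES:
        return "qwen-3.7-plus"  # any request involving image or video
    unsupported = sorted(needed - PLUS_MODALITIES)
    raise ValueError(f"unsupported modalities: {unsupported}")


if __name__ == "__main__":
    print(pick_model(["text"]))
    print(pick_model(["text", "image"]))
```

Requests that include audio raise an error here, since the summary notes that audio is reserved for the Omni versions.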
Topics
- Qwen 3.7 Max
- Qwen 3.7 Plus
- Large Language Models
- Multimodal AI
- SVG Generation
Best for: NLP Engineer, Computer Vision Engineer, Research Scientist, AI Engineer, Machine Learning Engineer, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Venelin Valkov.