not much happened today
Summary
Thinking Machines previewed "interaction models," trained from scratch for full-duplex multimodal interaction, so that a model can listen, speak, watch, think, search, and react at the same time. This is a shift beyond turn-based AI, with an emphasis on continuous-time awareness and visual proactivity. Separately, OpenAI launched the OpenAI Deployment Company, a majority-owned unit with 150 Forward Deployed Engineers from the Tomoro acquisition, backed by $4B in initial investment, to help enterprises deploy frontier models. OpenAI also introduced Daybreak, an initiative for defensive cyber operations that combines GPT-5.5, Codex, and specialized access tiers such as Trusted Access for Cyber. Agent control planes are maturing, with tools like aggit, Claude agents terminal, and Cursor in Microsoft Teams emerging. New benchmarks such as Artificial Analysis's Coding Agent Index measure model+harness combinations and reveal significant variation in cost, token usage, and time per task. Skepticism around TurboQuant is increasing, while local/open models continue to improve rapidly, with Qwen 3.6 35B A3B and DeepSeek V4 Flash showing strong performance on consumer hardware.
Key takeaway
CTOs and VPs of Engineering evaluating AI integration strategies should recognize that the shift to natively interactive, multimodal AI is underway and that it demands continuous processing rather than turn-based systems. Teams should also weigh the increasing maturity of local/open models such as Qwen 3.6 35B A3B, which offer compelling performance and cost advantages for agentic workloads on consumer hardware and could displace some hosted solutions within 12-24 months. Prioritize robust deployment and security frameworks, of the kind OpenAI's Daybreak illustrates, to manage the operational risks of advanced AI.
Key insights
Native, full-duplex multimodal interaction and robust enterprise deployment are key next frontiers for AI.
Principles
- AI interaction should be continuous, not turn-based.
- Enterprise AI deployment requires dedicated engineering support.
- Local/open models are improving fast enough to deliver strong performance within consumer hardware limits.
Method
Thinking Machines trains "interaction models" from scratch for real-time, concurrent multimodal processing. OpenAI deploys models via a dedicated engineering unit and offers security-focused distribution tiers.
In practice
- Use Qwen 3.6 35B A3B for long-context local code refactoring.
- Employ SGLang for efficient multimodal model stacks.
- Benchmark agent systems as model+harness combinations.
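One way to act on the last point is to key every result by the (model, harness) pair instead of by model alone, then compare cost, token usage, and time per task across pairs. The sketch below is a minimal illustration under stated assumptions: the `TaskRun` record, the `summarize` helper, and the placeholder names and numbers are hypothetical and are not taken from Artificial Analysis's Coding Agent Index or any published methodology.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskRun:
    """One benchmark task attempt (hypothetical record layout)."""
    model: str
    harness: str
    passed: bool
    cost_usd: float
    tokens: int
    seconds: float


def summarize(runs):
    """Aggregate runs per (model, harness) pair, not per model."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run.model, run.harness)].append(run)

    summary = {}
    for key, group in groups.items():
        summary[key] = {
            "pass_rate": mean([1.0 if r.passed else 0.0 for r in group]),
            "cost_per_task": mean([r.cost_usd for r in group]),
            "tokens_per_task": mean([r.tokens for r in group]),
            "seconds_per_task": mean([r.seconds for r in group]),
        }
    return summary


# Placeholder data, invented for illustration only.
example_runs = [
    TaskRun("model-a", "harness-x", True, 0.40, 12000, 90.0),
    TaskRun("model-a", "harness-x", False, 0.60, 18000, 110.0),
    TaskRun("model-a", "harness-y", True, 0.10, 4000, 30.0),
]

for (model, harness), stats in sorted(summarize(example_runs).items()):
    print(model, harness, stats)
```

Keeping the harness in the grouping key means the same model can appear on several rows, which makes harness-driven differences in cost and latency visible instead of averaging them away.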
Topics
- Multimodal AI Interaction
- OpenAI Enterprise Solutions
- AI Agent Orchestration
- Local LLM Inference
- AI Cybersecurity
Code references
- nathanlgabriel/paper_code_mapping_assessment
- Fringe210/llama.cpp-deepseek-v4-flash-cuda
- antirez/llama.cpp-deepseek-v4-flash
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Scientist, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by AINews.