Deep: New Multi-Modal AI Features Explored

Source: Department of Product · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Project & Product Management · Depth: Intermediate, quick

Summary

Pinterest's product teams are reportedly in a dispute with their CEO over the primary modality for the company's AI assistant, highlighting a broader challenge for product teams in 2026. The CEO advocates a voice-first approach to align with Gen Z expectations and create a "talking to a friend" shopping experience, while designers and product leaders argue this could undermine Pinterest's core visual discovery value. The conflict underscores a significant shift in product design: the interface layer is no longer fixed but a deliberate choice among text, voice, image, video, and documents. The analysis explores more than 30 new multimodal AI features from companies such as Google, Anthropic, DoorDash, and Lyft, and provides a framework to help product teams navigate these modality choices.

Key takeaway

For AI Product Managers designing new features, carefully consider the primary interaction modality. Your choice directly impacts user experience and product value, as seen in Pinterest's internal debate. Use a structured framework to evaluate whether voice, text, image, or a combination best serves your product's core purpose and user expectations, rather than adopting a modality simply for its novelty.

Key insights

Choosing the right AI modality is a critical product design decision that can make or break a product's core value.

Method

A five-test framework helps product teams decide on appropriate AI modalities by asking key questions about user interaction and product value proposition.

Best for: AI Product Manager, Product Manager, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Department of Product.