Vibe Coding with Kev: I Built an Outfit Recommender and All I Got Was Combat Boots With a Sundress

2026-02-13 · Source: Data Science on Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Intermediate, medium

Summary

An attempt to build an AI-powered outfit recommender, extending an earlier visual search project, failed to produce coherent outfits even though each component worked as intended. The system used Meta's Segment Anything Model (SAM) to isolate individual garments in editorial lookbook photos, then used FashionCLIP to match those garments to a product catalog and assemble outfit recommendations. SAM segmented thousands of high-quality items from various brand lookbooks, and FashionCLIP accurately matched them to similar products. The recommendation engine still failed, because it could not distinguish items that merely co-occur in a photo from items that are stylistically compatible, which produced illogical pairings such as sundresses with combat boots. The project showed that domain knowledge, such as styling rules and occasion context, is crucial and cannot be derived from visual embeddings alone.

Key takeaway

For Machine Learning Engineers building recommendation systems, recognize that purely visual or co-occurrence-based approaches often fall short for tasks requiring nuanced domain knowledge. Your models may perform perfectly on individual components, but integrating human expertise or structured domain rules is critical to bridge the gap between technical functionality and practical, useful output. Consider how to encode "what goes with what" beyond simple visual similarity.

Key insights

Visual embeddings alone cannot capture complex domain knowledge like fashion compatibility for outfit recommendations.

Principles

Co-occurrence does not imply compatibility.
Domain knowledge is essential for practical AI applications.
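
The first principle can be illustrated with a toy sketch. Everything in it is hypothetical: the photo sets, the OCCASION tag table, and the helper names are invented for illustration and do not come from the original article. It counts which items appear together in photos, then shows how a small hand-written rule table changes the top suggestion.

```python
from collections import Counter
from itertools import combinations

# Hypothetical lookbook photos: each set holds the items segmented from one image.
photos = [
    {"sundress", "combat boots"},
    {"sundress", "combat boots"},
    {"sundress", "sandals"},
]

# Count how often each unordered pair of items shares a photo.
pair_counts = Counter()
for items in photos:
    for pair in combinations(sorted(items), 2):
        pair_counts[pair] += 1

# Hypothetical domain knowledge: occasion tags assigned by a stylist.
OCCASION = {
    "sundress": {"summer casual"},
    "sandals": {"summer casual"},
    "combat boots": {"streetwear"},
}

def partners(item):
    """Return {other_item: co-occurrence count} for every item seen with `item`."""
    found = {}
    for (a, b), count in pair_counts.items():
        if item == a:
            found[b] = count
        elif item == b:
            found[a] = count
    return found

def top_cooccurring(item):
    """Pick the partner seen most often in the same photo."""
    found = partners(item)
    return max(found, key=found.get) if found else None

def top_compatible(item):
    """Pick the most frequent partner that also shares an occasion tag."""
    found = {
        other: count
        for other, count in partners(item).items()
        if OCCASION[item] & OCCASION[other]
    }
    return max(found, key=found.get) if found else None

by_count = top_cooccurring("sundress")   # driven only by photo co-occurrence
by_rules = top_compatible("sundress")    # co-occurrence filtered by the rule table
```

With this toy data, by_count is "combat boots" and by_rules is "sandals". The tag table is a stand-in for whatever styling knowledge a real system would encode; the choice of tags is a styling judgment, not something the counts can supply.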

Method

The system used Meta's SAM for image segmentation and FashionCLIP for visual matching, attempting to generate outfit recommendations by pairing segmented items from lookbook images with catalog products.
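
The matching step can be sketched as a nearest-neighbor search over embedding vectors. This is a minimal illustration, not the author's implementation: it assumes each segmented garment and each catalog product has already been turned into a vector, and it uses made-up three-dimensional vectors in place of real FashionCLIP embeddings. The function names and catalog entries are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_product(segment_vec, catalog):
    """Return the catalog key whose vector is most similar to segment_vec."""
    return max(catalog, key=lambda name: cosine(segment_vec, catalog[name]))

# Toy vectors standing in for real image embeddings (assumption, not real data).
catalog = {
    "floral sundress": [0.9, 0.1, 0.0],
    "combat boots": [0.0, 0.2, 0.9],
    "leather sandals": [0.1, 0.9, 0.2],
}

# Hypothetical embedding of one garment segmented from a lookbook photo.
segment = [0.8, 0.2, 0.1]

best = nearest_product(segment, catalog)
```

Here best is "floral sundress", the closest catalog vector to the segment. Note what this step does and does not answer: it finds the product that looks most like a segmented garment, and says nothing about which other products would go well with it.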

In practice

Use SAM for high-quality image segmentation.
Apply FashionCLIP for visual similarity search.

Topics

Outfit Recommender Systems
Visual Search
Segment Anything Model
FashionCLIP
AI Domain Expertise

Best for: Machine Learning Engineer, Data Scientist, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Data Science on Medium.