Semantic Fencing of Video Streams Using Embedding Splits from Vision Foundation Models

2026-05-15 · Source: AMD ROCm Blogs · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Robotics & Autonomous Systems · Depth: Advanced, extended

Summary

AMD presents "Semantic Fencing," a method for splitting vision datasets into training, validation, and test sets using embeddings from vision foundation models such as CLIP and DINO. The approach, implemented as the open-source tool BubbleFence, targets a weakness of metadata-based and random splitting: in high-volume data streams such as autonomous driving logs, these strategies often cause "semantic leakage," where semantically similar content lands on both sides of a split and inflates performance metrics. BubbleFence builds semantically meaningful splits by defining bounded regions ("bubbles") in the latent embedding space, centered on data-derived anchors. It supports incremental dataset growth: new frames are assigned automatically to existing bubbles, and new anchors are placed only as needed to maintain target evaluation ratios. The article demonstrates the method on the Zenseact Open Dataset and on Minecraft gameplay videos. The method is designed to be domain-agnostic, relying on learned visual representations rather than ad hoc, domain-specific heuristics.

Key takeaway

For AI engineers managing continuous, high-volume visual data streams, BubbleFence offers a way to make model evaluation more reliable. Splitting datasets by vision foundation model embeddings mitigates "semantic leakage," so reported metrics better reflect generalization to content the model has not seen. Integrated into an MLOps pipeline, it maintains a stable split structure that grows incrementally and adapts as new content appears, reducing manual curation effort.

Key insights

Semantic Fencing uses vision foundation model embeddings to create semantically grounded dataset splits that mitigate leakage between training and evaluation sets.

Principles

Embeddings capture high-level visual meaning.
Latent space geometry reflects semantic similarity.
Data streams exhibit temporal, spatial, and semantic correlations.
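
To make these principles concrete, the following is a minimal sketch on synthetic data, not BubbleFence code. It assumes frames from the same scene have nearly identical embeddings (here, a scene vector plus small noise) and compares a random split with a split that holds out whole scenes. The function names, the 0.95 similarity threshold, and the synthetic stream are illustrative assumptions.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def leakage_rate(train: np.ndarray, test: np.ndarray, threshold: float = 0.95) -> float:
    """Fraction of test embeddings with a near-identical neighbor in train.

    The threshold is an illustrative assumption, not a value from the article.
    """
    sims = cosine_similarity_matrix(test, train)
    return float((sims.max(axis=1) >= threshold).mean())


rng = np.random.default_rng(0)

# Synthetic stand-in for a video stream: 20 scenes with 50 frames each.
# Frames within a scene are small perturbations of one scene vector.
scenes = rng.normal(size=(20, 64))
frames = np.repeat(scenes, 50, axis=0) + 0.05 * rng.normal(size=(1000, 64))
scene_id = np.repeat(np.arange(20), 50)

# Random split: 20% of frames go to test regardless of content.
perm = rng.permutation(len(frames))
random_leak = leakage_rate(train=frames[perm[200:]], test=frames[perm[:200]])

# Content-aware split: four whole scenes are held out for test.
held_out = scene_id >= 16
fenced_leak = leakage_rate(train=frames[~held_out], test=frames[held_out])

print(f"random split leakage: {random_leak:.2f}")
print(f"scene-level split leakage: {fenced_leak:.2f}")
```

In this toy setup the random split leaves almost every test frame with a near-duplicate in the training set, while the scene-level split does not. Real streams lack clean scene labels, which is the gap that embedding-based fencing is meant to fill.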

Method

BubbleFence maps images to embedding vectors, removes near-duplicates, places quasi-Monte Carlo (QMC) anchors, constructs adaptive bubbles with nested evaluation shells around them, and persists its state so that newly ingested data can be assigned incrementally.
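
The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the BubbleFence implementation: embeddings are taken as given low-dimensional vectors, anchors come from a hand-rolled Halton sequence snapped to the nearest data point, a single global radius pair stands in for adaptive per-bubble radii, and the inner core of each bubble is treated as test with the surrounding shell as validation. All names (`FenceState`, `fit_fence`, and so on) are hypothetical.

```python
from dataclasses import dataclass

import numpy as np


def halton(n: int, dim: int) -> np.ndarray:
    """First n points of a Halton low-discrepancy sequence in [0, 1)^dim (dim <= 6)."""
    primes = [2, 3, 5, 7, 11, 13][:dim]
    out = np.empty((n, dim))
    for j, base in enumerate(primes):
        for i in range(n):
            f, r, k = 1.0, 0.0, i + 1
            while k > 0:  # radical inverse of i + 1 in the given base
                f /= base
                r += f * (k % base)
                k //= base
            out[i, j] = r
    return out


def drop_near_duplicates(emb: np.ndarray, min_dist: float = 1e-3) -> np.ndarray:
    """Greedy filter: keep a frame only if it is at least min_dist from all kept frames."""
    kept = []
    for i, v in enumerate(emb):
        if not kept or np.linalg.norm(emb[kept] - v, axis=1).min() >= min_dist:
            kept.append(i)
    return np.array(kept)


def place_anchors(emb: np.ndarray, n_anchors: int) -> np.ndarray:
    """Spread QMC probes over the data's bounding box, then snap each to its nearest frame."""
    lo, hi = emb.min(axis=0), emb.max(axis=0)
    probes = lo + halton(n_anchors, emb.shape[1]) * (hi - lo)
    dists = np.linalg.norm(emb[None, :, :] - probes[:, None, :], axis=2)
    return emb[np.unique(dists.argmin(axis=1))]


def nearest_anchor_distance(emb: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    return np.linalg.norm(emb[:, None, :] - anchors[None, :, :], axis=2).min(axis=1)


@dataclass
class FenceState:
    """State that would be persisted between ingestion runs (assumed layout)."""
    anchors: np.ndarray
    r_test: float  # inner core radius
    r_val: float   # outer shell radius

    def assign(self, emb: np.ndarray) -> np.ndarray:
        d = nearest_anchor_distance(emb, self.anchors)
        return np.where(d <= self.r_test, "test", np.where(d <= self.r_val, "val", "train"))


def fit_fence(emb, n_anchors=8, test_frac=0.1, val_frac=0.1) -> FenceState:
    """Pick radii as distance quantiles so the initial split meets the target ratios."""
    anchors = place_anchors(emb, n_anchors)
    d = nearest_anchor_distance(emb, anchors)
    return FenceState(anchors, float(np.quantile(d, test_frac)),
                      float(np.quantile(d, test_frac + val_frac)))


rng = np.random.default_rng(0)
base = rng.normal(size=(500, 3))                # toy 3-D "embeddings"
stream = np.vstack([base, base[:20] + 1e-6])    # append 20 near-duplicate frames
unique = stream[drop_near_duplicates(stream)]
state = fit_fence(unique)
labels = state.assign(unique)
new_labels = state.assign(rng.normal(size=(100, 3)))  # later batch reuses the same fence
```

Because `FenceState` holds only the anchors and two radii, a later batch is labeled without revisiting earlier frames, which mirrors the incremental behavior described above. The sketch never adds anchors for new content, a step the summary attributes to BubbleFence.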

In practice

Use BubbleFence for robust train/val/test splits.
Apply to autonomous driving or video game datasets.
Configure via YAML for pipeline behavior tuning.

Topics

BubbleFence
Semantic Dataset Splitting
Vision Foundation Models
Latent Space Embeddings
Streaming Data Curation

Best for: Machine Learning Engineer, AI Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by AMD ROCm Blogs.