Post-Launch Capability Expansion of Vision-Language Models via Prompting for On-Orbit Spacecraft Inspection

2026-06-13 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Advanced, quick

Summary

A study investigates the post-launch capability expansion of vision-language models (VLMs) for on-orbit spacecraft inspection, addressing the impracticality of updating model weights after deployment. Researchers evaluated prompt-driven VLMs, specifically SAM3, for zero-shot instance segmentation of spacecraft components using natural-language prompts without modifying onboard weights. Tested with frozen weights on 129 images of previously unseen satellites, SAM3 achieved 0.385 mAP@0.5 and 0.267 mAP@0.5:0.95. Performance was highly scale-dependent: large elements such as spacecraft bodies (0.639 AP@0.5) and solar arrays (0.598 AP@0.5) localized reliably, while smaller components such as antennas (0.221 AP@0.5) and thrusters (0.081 AP@0.5) remained challenging. Structured prompts that incorporate spatial and geometric descriptors improved performance by up to 82% over short category-name prompts. The model operates within the memory and compute limits of contemporary embedded GPUs.

Key takeaway

For robotics engineers or AI scientists developing spaceborne perception systems, prompt-driven vision-language models offer a practical solution for expanding semantic capabilities post-launch. You can add new component recognition without costly weight updates, especially for larger structural elements. Focus on crafting structured prompts with spatial and geometric descriptors to maximize performance, while acknowledging current limitations for fine-scale component localization due to orbital domain shift.

Key insights

Prompt-driven vision-language models enable post-launch semantic expansion for spaceborne inspection without onboard weight updates.

Principles

Post-launch model updates are operationally impractical for spaceborne systems.
Prompt formulation significantly influences VLM performance in zero-shot tasks.
Zero-shot VLM performance is strongly scale-dependent for object localization.

Method

The study evaluates zero-shot instance segmentation of spacecraft components using prompt-driven vision-language models (SAM3) on 129 unseen satellite images, under a strictly frozen, single-pass inference protocol.
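The two headline metrics differ only in how strictly a predicted mask must overlap the ground truth. The sketch below is a minimal illustration, not the study's evaluation code: it computes mask IoU and checks it against a 0.50 to 0.95 threshold sweep in steps of 0.05, which is the COCO convention and is assumed here rather than stated in the source. The helper names and the toy masks are hypothetical, and the precision-recall integration that a full AP computation requires is left out.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # binary mask stored as rows of 0/1 values


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two same-sized binary masks."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    return inter / union if union else 0.0


def iou_thresholds() -> List[float]:
    """Thresholds 0.50, 0.55, ..., 0.95 (COCO-style sweep, assumed)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def hits_per_threshold(iou: float) -> List[bool]:
    """True where a prediction with this IoU would count as a match."""
    return [iou >= t for t in iou_thresholds()]


# Toy example: the prediction covers three of the four ground-truth pixels.
pred = [[1, 1, 1, 0]]
truth = [[1, 1, 1, 1]]
iou = mask_iou(pred, truth)
matches = hits_per_threshold(iou)
```

A mask like this counts as a match at IoU 0.5 but fails the stricter thresholds, so it contributes fully to mAP@0.5 and only partly to mAP@0.5:0.95. That is one way a gap such as 0.385 versus 0.267 can arise.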

In practice

Utilize structured prompts for improved VLM performance.
Consider VLMs for adding recognition of large components after launch; small parts such as antennas and thrusters remain unreliable.
VLMs can extend capabilities without modifying onboard model weights.
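One way to read the points above is that the prompt text becomes the only thing that changes after launch. The sketch below is a minimal illustration under that assumption: it builds a short category-name prompt and a structured prompt from the same record. The `ComponentPrompt` class, its fields, and the descriptor wording are all hypothetical, since the source does not give the prompts used in the study, and the call that passes the text to the model is deliberately omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComponentPrompt:
    """Hypothetical record describing one spacecraft component in text."""
    category: str                                       # e.g. "solar array"
    geometry: List[str] = field(default_factory=list)   # geometric descriptors
    location: str = ""                                  # spatial descriptor

    def short(self) -> str:
        """Baseline prompt: the category name alone."""
        return self.category

    def structured(self) -> str:
        """Geometric descriptors, then the category, then the spatial phrase."""
        text = " ".join(list(self.geometry) + [self.category])
        if self.location:
            text = f"{text} {self.location}"
        return text


# Illustrative wording only; these descriptors are not from the study.
solar_array = ComponentPrompt(
    category="solar array",
    geometry=["flat", "rectangular"],
    location="extending from the spacecraft body",
)
baseline_text = solar_array.short()
structured_text = solar_array.structured()
```

Keeping both forms side by side makes it easy to compare them on the same images before choosing which text to send to the spacecraft, and adding a new component means adding a new record rather than touching the weights.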

Topics

Spacecraft Inspection
Vision-Language Models
Prompt Engineering
Zero-shot Learning
On-orbit Servicing
Instance Segmentation
Embedded GPUs

Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.