Testing VLMs and LLMs for robotics w/ the Jetson Thor devkit

2025-08-30 · Source: sentdex · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Software Development & Engineering · Depth: Intermediate, extended

Summary

The Jetson Thor devkit, priced at approximately $3,000, has 128 GB of unified memory that the GPU can use as video memory, but a relatively low memory bandwidth of 273 GB/s, which is slow by GPU standards yet fast compared with typical CPU system RAM. It is designed primarily for robotics, and its core advantage is a low power draw of 130 watts at maximum power, which suits battery-powered applications. The actual compute board inside the devkit is a small T5000-like module; most of the remaining volume is devoted to heat dissipation. The devkit is not well suited to model training, but it does well at inference, particularly for local LLMs and Vision Language Models (VLMs) such as Moondream 2. Its large memory capacity partly offsets the bandwidth limit by leaving room for many copies of a model at once, which enables techniques like pipeline parallelism: with 15 parallel servers, the VLM reached 30 FPS for object detection.

Key takeaway

For robotics engineers developing on-device AI, the Jetson Thor devkit is a compelling option. Its 128 GB of memory and 130-watt maximum power draw make it practical to deploy complex LLMs and VLMs directly on robots, despite its 273 GB/s memory bandwidth. Prioritize inference workloads, and explore parallel processing techniques to maximize throughput for real-time applications like object tracking and visual querying.

Key insights

The Jetson Thor devkit offers high memory and low power for robotics inference, despite limited memory bandwidth.

Principles

High memory capacity can compensate for limited bandwidth in workloads where several model instances can run side by side.
Pipeline parallelism, as the term is used here, improves throughput by distributing incoming work across multiple model instances.
Power efficiency is critical for embedded robotics applications.
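
To make the capacity-versus-bandwidth trade-off concrete, here is a rough back-of-the-envelope sketch. It relies on a common rule of thumb that is not stated in the source: when a dense model generates one token at a time for a single request, each token requires reading roughly all of the weights from memory, so bandwidth divided by weight size gives an upper bound on tokens per second. The model sizes and the 4-bit quantization below are hypothetical examples; only the 273 GB/s and 128 GB figures come from the summary.

```python
THOR_BANDWIDTH_GB_S = 273.0  # from the summary
THOR_MEMORY_GB = 128.0       # from the summary


def weight_size_gb(params_billions, bytes_per_param):
    """Approximate weight footprint in GB (ignores KV cache and activations)."""
    return params_billions * bytes_per_param


def decode_upper_bound_tok_s(params_billions, bytes_per_param,
                             bandwidth_gb_s=THOR_BANDWIDTH_GB_S):
    """Rule-of-thumb ceiling: one full read of the weights per generated token."""
    return bandwidth_gb_s / weight_size_gb(params_billions, bytes_per_param)


# Hypothetical models, both quantized to 4 bits (0.5 bytes per parameter).
for name, params_b in [("8B model", 8), ("70B model", 70)]:
    size = weight_size_gb(params_b, 0.5)
    fits = size <= THOR_MEMORY_GB
    bound = decode_upper_bound_tok_s(params_b, 0.5)
    print(f"{name}: {size:.1f} GB, fits={fits}, <= {bound:.1f} tok/s")
```

Under these assumptions both models fit in memory, but the larger one is capped at a much lower token rate. Real throughput will be lower than this ceiling, and batching or running several instances changes the picture, which is where the spare memory capacity becomes useful.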

Method

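To optimize VLM performance on bandwidth-limited hardware, run multiple instances of the model in parallel and distribute incoming frames across them. This raises the effective FPS of the system as a whole; the latency of each individual frame stays the same, because every frame is still processed by a single instance.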
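
The method above can be sketched as a round-robin dispatcher. This is a minimal illustration, not the setup used in the original video: `fake_infer` is a hypothetical stand-in that sleeps instead of calling a real VLM server, and the worker count and latency are made-up values. Handing whole frames to independent model replicas is often also called data or replica parallelism; the sketch keeps the summary's term.

```python
import itertools
import queue
import threading
import time


def fake_infer(frame_id, latency_s):
    """Hypothetical stand-in for one VLM request; sleeps to mimic fixed latency."""
    time.sleep(latency_s)
    return frame_id, f"detections-for-frame-{frame_id}"


def worker(inbox, results, latency_s):
    # Each worker models one independent model server with its own queue.
    while True:
        frame_id = inbox.get()
        if frame_id is None:  # sentinel value: shut this worker down
            break
        results.put(fake_infer(frame_id, latency_s))


def run_round_robin(num_frames, num_workers, latency_s):
    """Send frame i to worker i mod N; return ordered results and overall FPS."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, results, latency_s))
        for inbox in inboxes
    ]
    for t in threads:
        t.start()

    start = time.perf_counter()
    for frame_id, inbox in zip(range(num_frames), itertools.cycle(inboxes)):
        inbox.put(frame_id)
    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()
    elapsed = time.perf_counter() - start

    # Results can arrive out of order, so sort them by frame id.
    ordered = sorted(results.get() for _ in range(num_frames))
    return ordered, num_frames / elapsed


if __name__ == "__main__":
    _, fps_one = run_round_robin(num_frames=10, num_workers=1, latency_s=0.05)
    _, fps_many = run_round_robin(num_frames=30, num_workers=15, latency_s=0.05)
    print(f"1 worker: {fps_one:.1f} FPS, 15 workers: {fps_many:.1f} FPS")
```

Each simulated frame still takes the same time to process, yet the combined frame rate grows with the number of workers. Applied to the figures in the summary, and assuming perfectly linear scaling, 30 FPS across 15 servers would imply about 2 FPS per server, or roughly half a second of latency per frame. On real hardware the instances compete for the same memory bandwidth, so scaling is unlikely to be perfectly linear.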

In practice

Use the Jetson Thor for local LLM and VLM inference on robots.
Implement pipeline parallelism for VLM tasks to boost frame rates.
Budget for the 130 W maximum power draw in battery-constrained robotic systems.

Topics

Jetson Thor Devkit
Memory Bandwidth
Local LLMs
Vision Language Models
Pipeline Parallelism

Best for: AI Engineer, Machine Learning Engineer, Robotics Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by sentdex.