Testing VLMs and LLMs for robotics w/ the Jetson Thor devkit
Summary
The Jetson Thor devkit, priced at approximately $3,000, has 128 GB of unified memory that can serve as video memory, but a relatively low memory bandwidth of 273 GB/s: slow by GPU standards, fast by CPU/RAM standards. It is designed primarily for robotics, and its core advantage is low power draw, about 130 watts at maximum power, which suits battery-powered applications. The actual processing board is a small T5000-like module; the rest of the devkit is heat dissipation. The devkit is not suited to model training, but it does well at inference, particularly for local LLMs and Vision Language Models (VLMs) such as Moondream 2. Its large memory capacity, despite the bandwidth limit, enables techniques like pipeline parallelism, in which many model instances run side by side. This significantly improves VLM throughput: 15 parallel servers reach 30 FPS for object detection.
Key takeaway
For robotics engineers developing on-device AI, the Jetson Thor devkit is a compelling option. Its 128 GB of memory and 130-watt maximum power draw make it practical to deploy complex LLMs and VLMs directly on robots, despite its modest 273 GB/s memory bandwidth. Prioritize inference workloads and explore parallel processing techniques to maximize throughput for real-time applications like object tracking and visual querying.
Key insights
The Jetson Thor devkit offers high memory and low power for robotics inference, despite limited memory bandwidth.
Principles
- High memory capacity can compensate for limited bandwidth in specific workloads.
- Pipeline parallelism improves throughput by distributing tasks across multiple model instances.
- Power efficiency is critical for embedded robotics applications.
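The trade-off between capacity and bandwidth can be put into rough numbers. The sketch below uses the common rule of thumb that single-stream token generation is memory-bound, so each generated token reads roughly all model weights once. The 273 GB/s and 128 GB figures come from the summary; the model sizes and the function names are hypothetical, and the estimate ignores KV cache, activations, and compute limits, so treat the results as ceilings rather than measurements.

```python
BANDWIDTH_GB_S = 273.0  # memory bandwidth quoted in the summary
MEMORY_GB = 128.0       # memory capacity quoted in the summary


def decode_ceiling_tokens_per_s(weights_gb: float,
                                bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumes every generated token requires reading all weights once,
    which is a simplification (no KV cache, no batching effects).
    """
    if weights_gb <= 0:
        raise ValueError("weights_gb must be positive")
    return bandwidth_gb_s / weights_gb


def fits_in_memory(weights_gb: float, memory_gb: float = MEMORY_GB) -> bool:
    """True if the weights alone fit in memory (ignores runtime overhead)."""
    return weights_gb <= memory_gb


# Hypothetical model footprints in GB, chosen only for illustration.
for weights_gb in (4.0, 16.0, 64.0):
    ceiling = decode_ceiling_tokens_per_s(weights_gb)
    print(f"{weights_gb:5.1f} GB weights: fits={fits_in_memory(weights_gb)}, "
          f"ceiling ~{ceiling:.1f} tokens/s")
```

The pattern this illustrates: large models fit comfortably, but the per-stream speed ceiling drops as the weights grow, which is why throughput-oriented techniques matter on this hardware.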
Method
To optimize VLM performance on bandwidth-limited hardware, run multiple instances of the model in parallel and distribute incoming frames across them. Effective FPS rises with the number of instances, while the latency of each individual frame stays about the same.
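A minimal sketch of that frame distribution, assuming round-robin assignment and one frame in flight per instance. `FakeVLMWorker` is a hypothetical stand-in that sleeps instead of calling a real model server, and the 50 ms latency is invented for the demo; the original article's actual serving setup is not described in this summary.

```python
import itertools
import time
from concurrent.futures import Future, ThreadPoolExecutor


class FakeVLMWorker:
    """Hypothetical stand-in for one model server instance."""

    def __init__(self, worker_id: int, latency_s: float) -> None:
        self.worker_id = worker_id
        self.latency_s = latency_s
        # A single-thread pool means this instance handles one frame at a time.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def submit(self, frame_id: int) -> Future:
        return self._pool.submit(self._infer, frame_id)

    def _infer(self, frame_id: int) -> tuple:
        time.sleep(self.latency_s)  # placeholder for real inference
        return frame_id, self.worker_id

    def close(self) -> None:
        self._pool.shutdown(wait=True)


def dispatch_round_robin(workers: list, frame_ids: list) -> list:
    """Send frame i to worker i % len(workers); return results in frame order."""
    futures = [w.submit(f) for f, w in zip(frame_ids, itertools.cycle(workers))]
    return [fut.result() for fut in futures]


# Demo: 15 instances, each taking 50 ms per frame (made-up latency).
workers = [FakeVLMWorker(i, latency_s=0.05) for i in range(15)]
start = time.perf_counter()
results = dispatch_round_robin(workers, list(range(30)))
elapsed = time.perf_counter() - start
for w in workers:
    w.close()
print(f"{len(results)} frames in {elapsed:.2f} s, {len(results) / elapsed:.0f} FPS effective")
```

Read against the summary's figures, 30 FPS from 15 servers works out to about 2 FPS per instance, or roughly half a second per frame if each instance handles one frame at a time. Parallel instances raise throughput; they do not make any single frame come back sooner.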
In practice
- Use the Jetson Thor for local LLM and VLM inference on robots.
- Implement pipeline parallelism for VLM tasks to boost frame rates.
- Budget for the 130 W maximum power draw when sizing batteries for robotic systems.
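To make the power point concrete, here is a back-of-the-envelope runtime estimate. Only the 130 W figure comes from the summary; the 1,000 Wh battery and the 80% usable fraction are hypothetical, and the calculation covers the compute board alone, not motors, sensors, or conversion losses.

```python
def runtime_hours(battery_wh: float, load_w: float,
                  usable_fraction: float = 0.8) -> float:
    """Hours of runtime for a constant load from a battery.

    usable_fraction is an assumed derating for depth-of-discharge limits.
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return battery_wh * usable_fraction / load_w


THOR_MAX_W = 130.0    # maximum power draw quoted in the summary
BATTERY_WH = 1000.0   # hypothetical battery pack

hours = runtime_hours(BATTERY_WH, THOR_MAX_W)
print(f"~{hours:.1f} h at {THOR_MAX_W:.0f} W from a {BATTERY_WH:.0f} Wh pack")
```

Since 130 W is the maximum, real workloads that leave the board partly idle would draw less and run longer than this estimate suggests.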
Topics
- Jetson Thor Devkit
- Memory Bandwidth
- Local LLMs
- Vision Language Models
- Pipeline Parallelism
Best for: AI Engineer, Machine Learning Engineer, Robotics Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by sentdex.