Efficient Sensor Fusion for Gesture Recognition on Resource-Constrained Devices
Summary
A new gesture recognition system for smart eyewear leverages low-resolution Time-of-Flight (ToF) and Infrared (IR) thermal sensors to overcome the power, latency, and privacy issues of vision-based methods. The system uses an 8x8 multizone ToF sensor (VL53L8CH) and an 8x8 IR array (AMG8833) to capture depth and thermal data. A compact Convolutional Neural Network (CNN) with a grouped-convolution architecture efficiently fuses these modalities on a microcontroller (MCU). Tested on a custom dataset of 7 static gestures, the fusion strategy achieved 92.3% accuracy and a macro F1-score of 0.93, outperforming single-sensor baselines. On-device benchmarks on STM32F4 and STM32H7 MCUs show the system requires only 6,343 parameters, achieves millisecond-level inference latency, and consumes a total system power of 50 mW, making it suitable for resource-constrained wearables.
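To put the 6,343-parameter figure in context, the short calculation below estimates how much storage the weights alone would need. The 4-byte (float32) and 1-byte (int8) widths are assumptions for illustration; the summary does not state which numeric format the deployed model uses, and activations and code size are not included.

```python
def weight_memory_bytes(num_params: int, bytes_per_param: int) -> int:
    """Storage needed for the weights alone (no activations, no code)."""
    return num_params * bytes_per_param


NUM_PARAMS = 6343  # parameter count reported in the summary

# Assumed storage formats; the source does not say which one is deployed.
footprint = {
    "float32": weight_memory_bytes(NUM_PARAMS, 4),
    "int8": weight_memory_bytes(NUM_PARAMS, 1),
}

for dtype, size in footprint.items():
    print(f"{dtype}: {size} bytes ({size / 1024:.1f} KiB)")
```

Under either assumption the weights occupy well under 32 KiB, a small amount next to the on-chip flash of typical Cortex-M4 and Cortex-M7 parts.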
Key takeaway
For AI engineers developing gesture recognition for smart eyewear or other resource-constrained wearables, consider pairing low-resolution ToF and IR thermal sensors. The approach preserves privacy because neither sensor captures a camera image, and the reported results (92.3% accuracy at 50 mW total system power) suggest it can support robust human-computer interaction without the power and latency costs of vision-based systems. The on-device benchmarks show millisecond-level inference latency on common MCUs such as the STM32F4 and STM32H7.
Key insights
Sensor fusion of low-resolution ToF and IR enables efficient, private gesture recognition on resource-constrained devices.
Principles
- Multimodal sensor fusion enhances accuracy over single-sensor baselines.
- Grouped convolutions optimize CNNs for embedded systems.
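The saving from grouped convolutions can be checked with simple arithmetic: a grouped layer connects each output channel to only `c_in / groups` input channels, so its weight count shrinks by the group factor. The sketch below is illustrative; the layer sizes (16 input channels, 32 output channels, 3x3 kernels) are hypothetical and are not taken from the paper.

```python
def conv2d_param_count(c_in: int, c_out: int, kernel: int,
                       groups: int = 1, bias: bool = True) -> int:
    """Count trainable parameters of a 2D convolution layer."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must both be divisible by groups")
    # Each output channel only sees c_in // groups input channels.
    weights = c_out * (c_in // groups) * kernel * kernel
    return weights + (c_out if bias else 0)


# Hypothetical layer sizes, chosen only to show the ratio.
standard = conv2d_param_count(16, 32, 3, groups=1)
grouped = conv2d_param_count(16, 32, 3, groups=2)

print(f"standard: {standard} parameters")
print(f"grouped (groups=2): {grouped} parameters")
```

With two groups, the weight term is halved while the bias term is unchanged, which is why grouped layers are a common choice when flash and RAM are tight.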
Method
A compact CNN with a grouped-convolution architecture fuses 8x8 ToF and 8x8 IR sensor data for gesture classification on microcontrollers, reaching 92.3% accuracy with 6,343 parameters and 50 mW total system power.
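The idea of processing each modality with its own filters before mixing them can be sketched in plain Python. This is a minimal illustration, not the paper's architecture: the summary does not give layer counts or kernel sizes, so the 3x3 kernels, the one-filter-per-channel grouping (the special case where the number of groups equals the number of channels), the 1x1 fusion weights, and the constant input frames are all assumptions.

```python
from typing import List

Frame = List[List[float]]


def conv3x3_same(frame: Frame, kernel: Frame) -> Frame:
    """Cross-correlate one 2D frame with one 3x3 kernel, using zero padding."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in range(3):
                for dc in range(3):
                    rr, cc = r + dr - 1, c + dc - 1
                    if 0 <= rr < h and 0 <= cc < w:  # outside the frame counts as 0
                        acc += frame[rr][cc] * kernel[dr][dc]
            out[r][c] = acc
    return out


def grouped_conv(frames: List[Frame], kernels: List[Frame]) -> List[Frame]:
    """One group per channel: each modality is filtered only by its own kernel."""
    return [conv3x3_same(f, k) for f, k in zip(frames, kernels)]


def pointwise_fuse(maps: List[Frame], weights: List[float]) -> Frame:
    """1x1 convolution: weighted sum across channels at every pixel."""
    h, w = len(maps[0]), len(maps[0][0])
    return [[sum(wt * m[r][c] for wt, m in zip(weights, maps))
             for c in range(w)] for r in range(h)]


# Placeholder 8x8 frames standing in for normalized ToF depth and IR temperature.
tof = [[1.0] * 8 for _ in range(8)]
ir = [[0.5] * 8 for _ in range(8)]

# Hypothetical kernels: identity for the ToF branch, box blur for the IR branch.
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
box = [[1 / 9] * 3 for _ in range(3)]

branch_maps = grouped_conv([tof, ir], [identity, box])
fused = pointwise_fuse(branch_maps, [0.5, 0.5])
print(len(fused), len(fused[0]))
```

Keeping the branches separate in the early layer means no weights are spent on cross-modal connections until the 1x1 step, which is where the two feature maps are combined.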
In practice
- Utilize ToF and IR sensors for privacy-preserving HCI.
- Implement grouped convolutions for efficient on-device CNNs.
Topics
- Gesture Recognition
- Sensor Fusion
- Resource-Constrained Devices
- Time-of-Flight Sensors
- Infrared Thermal Sensors
Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Hardware Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.