Stereo Vision-Based Fall Prediction and Detection using Human Pose Estimation on the AMD Kria K26 SOM
Summary
A stereo vision-based system for fall prediction and detection runs human pose estimation (HPE) on an AMD Kria K26 System-on-Module (SOM). The portable, low-power, battery-operated design targets non-intrusive, privacy-preserving, real-time fall detection for elderly monitoring. An Intel RealSense D455 camera captures synchronized RGB and depth frames at 640 x 480 pixels and 60 FPS. The pipeline has three stages: a quantized YOLOX model detects the human bounding box, after which the RGB frames are discarded for privacy; Anchor-to-Joint (A2J) estimates 15 joint keypoints from the depth frames; and a CNN classifies fall activity from the joint coordinates. Reported accuracies were 74% for YOLOX, 84.13% for A2J, and 75.85% for the CNN. Throughput improved from 2.5 FPS with a single-threaded DPU (deep learning processing unit) to 4.5 FPS with a multi-threaded dual-core DPU. The results indicate that cloud-independent, on-device fall detection is feasible.
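To put the throughput figures above in perspective, the short calculation below derives the speedup, the average interval between processed frames, and how many captured frames arrive per processed frame. Only the three FPS values come from the summary; the function name and the interpretation as an average interval (not a measured end-to-end latency) are assumptions made for illustration.

```python
# Figures quoted in the summary.
CAMERA_FPS = 60.0          # RealSense D455 capture rate
SINGLE_THREAD_FPS = 2.5    # single-threaded DPU
MULTI_THREAD_FPS = 4.5     # multi-threaded dual-core DPU


def frame_interval_ms(fps: float) -> float:
    """Average time between processed frames, in milliseconds."""
    return 1000.0 / fps


speedup = MULTI_THREAD_FPS / SINGLE_THREAD_FPS
captured_per_processed = CAMERA_FPS / MULTI_THREAD_FPS

print(f"speedup: {speedup:.1f}x")
print(
    f"interval: {frame_interval_ms(SINGLE_THREAD_FPS):.0f} ms -> "
    f"{frame_interval_ms(MULTI_THREAD_FPS):.0f} ms"
)
print(f"captured frames per processed frame: {captured_per_processed:.1f}")
```

Under these numbers the multi-threaded configuration is 1.8 times faster, yet the camera still delivers roughly 13 frames for every frame the pipeline finishes, so most captured frames would have to be skipped or queued.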
Key takeaway
For Computer Vision Engineers developing assistive healthcare solutions, this work demonstrates a viable path for on-device, privacy-preserving fall detection. You should consider integrating stereo depth cameras and multi-stage quantized models like YOLOX and A2J on edge SOMs, such as the AMD Kria K26, to achieve real-time performance without cloud dependency. Prioritize discarding RGB data early to enhance user privacy in sensitive applications.
Key insights
Privacy-preserving, real-time fall detection is feasible on edge devices using stereo vision and quantized human pose estimation.
Principles
- Edge-based processing enhances privacy.
- Depth data enables robust pose estimation.
- Quantized models improve embedded performance.
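As a rough illustration of what quantization does to a model's weights, the sketch below maps floating-point values to 8-bit integer codes and back. The summary does not say which quantization scheme the authors used, so symmetric per-tensor int8 is an assumption here, and the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> integer codes in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.0]  # made-up example weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Rounding bounds the reconstruction error by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Integer codes take a quarter of the storage of 32-bit floats and suit fixed-point accelerators, at the cost of a small, bounded rounding error per value.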
Method
The system pairs an Intel RealSense D455 camera with a three-stage pipeline (quantized YOLOX, A2J, CNN) running on an AMD Kria K26 SOM. YOLOX detects the human bounding box, A2J estimates 15 joint keypoints from the depth frame, and a CNN classifies fall activity from those keypoints.
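The control flow of the three stages described above can be sketched as follows. This is not the authors' code: the class, the type aliases, and all three stub functions are hypothetical stand-ins for YOLOX, A2J, and the CNN, and frames are plain nested lists so the example runs without a camera or a DPU.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

BBox = Tuple[int, int, int, int]            # x0, y0, x1, y1 in pixels
Joints = List[Tuple[float, float, float]]   # (x, y, depth) per joint
Frame = List[List[float]]                   # row-major 2D image


@dataclass
class FallPipeline:
    detect_person: Callable[[Frame], Optional[BBox]]  # stage 1 (YOLOX in the article)
    estimate_joints: Callable[[Frame], Joints]        # stage 2 (A2J in the article)
    classify: Callable[[Joints], str]                 # stage 3 (CNN in the article)

    def process(self, rgb: Frame, depth: Frame) -> Optional[str]:
        bbox = self.detect_person(rgb)
        # Drop the local RGB reference; later stages only see depth.
        # The caller must also avoid retaining the RGB frame.
        del rgb
        if bbox is None:
            return None
        x0, y0, x1, y1 = bbox
        depth_crop = [row[x0:x1] for row in depth[y0:y1]]
        return self.classify(self.estimate_joints(depth_crop))


# Hypothetical stubs standing in for the real models.
def stub_detector(rgb: Frame) -> Optional[BBox]:
    return (0, 0, len(rgb[0]), len(rgb))  # treat the whole frame as one person


def stub_joints(depth_crop: Frame) -> Joints:
    h, w = len(depth_crop), len(depth_crop[0])
    # 15 joints spaced evenly down the vertical centre line of the crop.
    return [(w / 2, h * i / 14, depth_crop[0][0]) for i in range(15)]


def stub_classifier(joints: Joints) -> str:
    ys = [y for _, y, _ in joints]
    # Toy rule only: a small vertical extent is labelled a fall.
    return "fall" if max(ys) - min(ys) < 2 else "no_fall"


rgb = [[0.0] * 4 for _ in range(4)]
depth = [[1.5] * 4 for _ in range(4)]
pipeline = FallPipeline(stub_detector, stub_joints, stub_classifier)
label = pipeline.process(rgb, depth)
print(label)
```

Passing the stages in as callables keeps the privacy boundary visible in one place: only `detect_person` ever receives the RGB frame, and everything downstream works on the depth crop.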
In practice
- Use Intel RealSense D455 for depth sensing.
- Quantize models for edge device deployment.
- Discard RGB frames after bounding box detection.
Topics
- Stereo Vision
- Fall Detection
- Human Pose Estimation
- AMD Kria K26 SOM
- Edge AI
- Model Quantization
- Privacy Preservation
Best for: AI Scientist, Research Scientist, Computer Vision Engineer, AI Hardware Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.