Stereo Vision-Based Fall Prediction and Detection using Human Pose Estimation on the AMD Kria K26 SOM

2026-06-10 · Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition, Robotics & Autonomous Systems · Depth: Advanced, quick

Summary

A new stereo vision-based system for fall prediction and detection runs human pose estimation (HPE) on an AMD Kria K26 System-on-Module (SOM). The portable, low-power, battery-operated design targets non-intrusive, privacy-preserving, real-time fall detection for elderly monitoring. An Intel RealSense D455 camera captures synchronized RGB and depth frames at 640 x 480 pixels and 60 FPS. The pipeline has three stages: a quantized YOLOX model detects the human bounding box, after which the RGB frame is discarded for privacy; Anchor-to-Joint (A2J) estimates 15 joint keypoints from the depth frame; and a CNN classifies fall activity from the joint coordinates. Reported accuracies were 74% for YOLOX, 84.13% for A2J, and 75.85% for the CNN. Throughput improved from 2.5 FPS with a single-threaded Deep Learning Processor Unit (DPU) to 4.5 FPS with a multi-threaded dual-core DPU. The system demonstrates the feasibility of cloud-independent, on-device fall detection.
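To put the reported throughput numbers in context, the short calculation below derives the speedup, the average time per processed frame, and how many camera frames arrive for each frame the pipeline finishes. It is a back-of-the-envelope sketch: the three input figures come from the summary, while treating inverse throughput as a per-frame time budget is a simplifying assumption that ignores any pipelining overlap between stages.

```python
# Figures reported in the summary (frames per second).
SINGLE_THREAD_FPS = 2.5   # single-threaded DPU
DUAL_CORE_FPS = 4.5       # multi-threaded dual-core DPU
CAMERA_FPS = 60           # RealSense D455 capture rate


def frame_budget_ms(fps: float) -> float:
    """Average time per processed frame in milliseconds (inverse throughput)."""
    return 1000.0 / fps


# Relative gain from moving to the multi-threaded dual-core configuration.
speedup = DUAL_CORE_FPS / SINGLE_THREAD_FPS

# Camera frames captured for every frame the pipeline completes.
frames_per_result = CAMERA_FPS / DUAL_CORE_FPS

print(f"speedup: {speedup:.2f}x")
print(f"time per frame: {frame_budget_ms(SINGLE_THREAD_FPS):.0f} ms "
      f"-> {frame_budget_ms(DUAL_CORE_FPS):.0f} ms")
print(f"camera frames per processed frame: {frames_per_result:.1f}")
```

At 4.5 FPS the pipeline consumes roughly one in every thirteen captured frames, so a deployment would need to decide which frames to skip; the summary does not say how the original system handles this.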

Key takeaway

For computer vision engineers building assistive healthcare solutions, this work shows a viable path to on-device, privacy-preserving fall detection. Consider pairing a stereo depth camera with multi-stage quantized models such as YOLOX and A2J on an edge SOM like the AMD Kria K26 to run inference without cloud dependency. The reported throughput is 4.5 FPS, so check that rate against your own latency requirements. Discard RGB data as early as possible to protect user privacy in sensitive applications.

Key insights

Privacy-preserving, real-time fall detection is feasible on edge devices using stereo vision and quantized human pose estimation.

Principles

Edge-based processing enhances privacy.
Depth data enables robust pose estimation.
Quantized models improve embedded performance.

Method

The system pairs an Intel RealSense D455 camera with a three-stage pipeline (quantized YOLOX, A2J, CNN) running on an AMD Kria K26 SOM. YOLOX detects the human bounding box, A2J estimates 15 joint keypoints from the depth frame, and a CNN classifies fall activity from those joints.
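The control flow of the three stages can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`process_frame`, `crop_depth`), the stand-in stage callables, the label strings, and the choice to crop the depth frame to the detected box before pose estimation are all assumptions made for the example. Real stages would wrap the quantized models running on the DPU.

```python
from typing import Callable, List, Optional, Tuple

BBox = Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max) in pixels
Joint = Tuple[float, float, float]    # (x, y, depth)
DepthFrame = List[List[int]]          # rows of per-pixel depth values

NUM_JOINTS = 15  # joint count stated in the summary


def crop_depth(depth: DepthFrame, box: BBox) -> DepthFrame:
    """Crop the depth frame to the person box, clamped to the frame bounds."""
    height, width = len(depth), len(depth[0])
    x0, y0, x1, y1 = box
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [row[x0:x1] for row in depth[y0:y1]]


def process_frame(
    rgb: object,
    depth: DepthFrame,
    detect_person: Callable[[object], Optional[BBox]],
    estimate_joints: Callable[[DepthFrame], List[Joint]],
    classify: Callable[[List[Joint]], str],
) -> Optional[str]:
    """Run detection, pose estimation, and classification on one frame pair."""
    box = detect_person(rgb)   # stage 1: the only stage that sees RGB
    del rgb                    # drop this function's reference to the RGB frame;
                               # the caller must not keep its own copy either
    if box is None:
        return None            # nobody in view, nothing to classify
    joints = estimate_joints(crop_depth(depth, box))   # stage 2: depth only
    if len(joints) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
    return classify(joints)    # stage 3: joint coordinates only


# Hypothetical stand-ins for the three models, used only to exercise the flow.
def fake_detector(rgb: object) -> Optional[BBox]:
    return (1, 1, 3, 3)


def fake_pose(depth_crop: DepthFrame) -> List[Joint]:
    return [(0.0, 0.0, 1.0)] * NUM_JOINTS


def fake_classifier(joints: List[Joint]) -> str:
    return "no_fall"


depth = [[col for col in range(4)] for _ in range(4)]
label = process_frame(object(), depth, fake_detector, fake_pose, fake_classifier)
print(label)
```

Passing the stages in as callables keeps the privacy boundary visible in one place: only the detector ever receives the RGB frame, and everything downstream works from depth values and joint coordinates.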

In practice

Use an Intel RealSense D455 camera for depth sensing.
Quantize models for edge device deployment.
Discard RGB frames after bounding box detection.
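As a rough illustration of what quantizing a model involves, the sketch below applies symmetric per-tensor int8 quantization to a small list of weights and measures the round-trip error. It is a generic textbook scheme chosen for clarity; the summary does not describe the quantization toolchain or scale format used for the Kria K26 DPU, so the helper names and the scale choice here are assumptions.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Choose the scale so the largest magnitude maps to 127; guard against all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]


# Example weights are made up for illustration.
weights = [0.50, -1.27, 0.003, 1.00]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest step bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight shrinks from a 32-bit float to an 8-bit integer at the cost of an error no larger than half a quantization step, which is the trade-off that makes integer-only accelerators practical for models like these.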

Topics

Stereo Vision
Fall Detection
Human Pose Estimation
AMD Kria K26 SOM
Edge AI
Model Quantization
Privacy Preservation

Best for: AI Scientist, Research Scientist, Computer Vision Engineer, AI Hardware Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.