Stereo Vision-Based Fall Prediction and Detection using Human Pose Estimation on the AMD Kria K26 SOM

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition, Robotics & Autonomous Systems · Depth: Advanced, quick

Summary

A stereo vision-based system for fall prediction and detection runs human pose estimation (HPE) on an AMD Kria K26 System-on-Module (SOM). The portable, low-power, battery-operated device is intended to provide non-intrusive, privacy-preserving, real-time fall detection for monitoring elderly people. It pairs the SOM with an Intel RealSense D455 camera that captures synchronized RGB and depth frames at 640 x 480 pixels and 60 FPS. The pipeline has three stages: a quantized YOLOX model detects the human bounding box, after which the RGB frames are discarded for privacy; Anchor-to-Joint (A2J) estimates 15 joint keypoints from the depth frames; and a CNN classifies fall activity from the joint coordinates. Reported accuracies are 74% for YOLOX, 84.13% for A2J, and 75.85% for the CNN. Throughput rose from 2.5 FPS with a single-threaded DPU to 4.5 FPS with a multi-threaded dual-core DPU. The results indicate that cloud-independent, on-device fall detection is feasible.
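To put the throughput figures in perspective, the short calculation below derives the speedup and per-frame latency implied by the reported 2.5 FPS and 4.5 FPS, and compares them with the 60 FPS camera rate. It is only arithmetic on the numbers quoted in the summary; the constant and function names are illustrative and do not come from the original work.

```python
# Figures quoted in the summary; nothing here is measured independently.
CAMERA_FPS = 60.0          # RealSense capture rate
SINGLE_THREAD_FPS = 2.5    # single-threaded DPU
MULTI_THREAD_FPS = 4.5     # multi-threaded dual-core DPU


def latency_ms(fps: float) -> float:
    """Average time spent per processed frame, in milliseconds."""
    return 1000.0 / fps


speedup = MULTI_THREAD_FPS / SINGLE_THREAD_FPS
frames_per_inference = CAMERA_FPS / MULTI_THREAD_FPS

print(f"speedup: {speedup:.2f}x")
print(f"latency: {latency_ms(SINGLE_THREAD_FPS):.0f} ms -> "
      f"{latency_ms(MULTI_THREAD_FPS):.0f} ms")
print(f"camera frames per processed frame: {frames_per_inference:.1f}")
```

The multi-threaded configuration is therefore 1.8x faster, cutting average per-frame time from 400 ms to about 222 ms, which corresponds to processing roughly one of every 13 captured frames.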

Key takeaway

For computer vision engineers building assistive healthcare systems, this work shows a viable path to on-device, privacy-preserving fall detection. Consider pairing a stereo depth camera with multi-stage quantized models such as YOLOX and A2J on an edge SOM like the AMD Kria K26 to achieve real-time performance without cloud dependency. Discard RGB data as early as possible to protect user privacy in sensitive applications.

Key insights

Privacy-preserving, real-time fall detection is feasible on edge devices using stereo vision and quantized human pose estimation.

Method

The system pairs an Intel RealSense D455 camera with a three-stage pipeline (quantized YOLOX, A2J, CNN) running on an AMD Kria K26 SOM. YOLOX detects human bounding boxes, A2J estimates 15 joint keypoints from depth frames, and a CNN classifies fall activity from the joint coordinates.
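The three-stage flow can be sketched as a small Python class. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `FallPipeline`, the type aliases, the `"fall"`/`"no_fall"` labels, and the three placeholder stage functions are all hypothetical, and the sketch assumes the detector runs on the RGB frame before that frame is dropped. Real stages would wrap the quantized YOLOX, A2J, and CNN models running on the DPU.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Frame = List[List[float]]           # row-major image; stand-in for a real array type
Box = Tuple[int, int, int, int]     # (x0, y0, x1, y1) in pixels
Joint = Tuple[float, float, float]  # (x, y, depth)

NUM_JOINTS = 15  # joint count reported for the A2J stage


@dataclass
class FallPipeline:
    """Chains the three stages; each stage is injected as a plain callable."""
    detect_person: Callable[[Frame], Optional[Box]]       # stage 1: detector (assumed on RGB)
    estimate_joints: Callable[[Frame, Box], List[Joint]]  # stage 2: joints from depth
    classify: Callable[[Sequence[Joint]], str]            # stage 3: label from joints

    def step(self, rgb: Frame, depth: Frame) -> Optional[str]:
        box = self.detect_person(rgb)
        del rgb  # drop this method's reference; later stages never see RGB
        if box is None:
            return None  # nobody detected, nothing to classify
        joints = self.estimate_joints(depth, box)
        if len(joints) != NUM_JOINTS:
            raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
        return self.classify(joints)


# Placeholder stages so the sketch runs end to end; they are NOT the real models.
def fake_detector(rgb: Frame) -> Optional[Box]:
    # Pretend the person fills the whole frame whenever a frame is present.
    return (0, 0, len(rgb[0]), len(rgb)) if rgb and rgb[0] else None


def fake_a2j(depth: Frame, box: Box) -> List[Joint]:
    # Put every joint at the box centre, reading depth at that pixel.
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    return [(cx, cy, depth[int(cy)][int(cx)])] * NUM_JOINTS


def fake_classifier(joints: Sequence[Joint]) -> str:
    # Toy threshold standing in for a learned decision; the rule is arbitrary.
    mean_depth = sum(j[2] for j in joints) / len(joints)
    return "fall" if mean_depth < 1.0 else "no_fall"


pipeline = FallPipeline(fake_detector, fake_a2j, fake_classifier)
rgb = [[0.0] * 4 for _ in range(4)]
depth = [[2.5] * 4 for _ in range(4)]
label = pipeline.step(rgb, depth)
print(label)
```

Passing the stages in as callables keeps the privacy boundary visible in one place: only `detect_person` ever receives the RGB frame, and everything downstream works on depth data and joint coordinates.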

Best for: AI Scientist, Research Scientist, Computer Vision Engineer, AI Hardware Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.