Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production

2026-04-14 · Source: Artificial Intelligence · Field: Manufacturing & Industrial — Automation & Robotics, Artificial Intelligence & Machine Learning, Manufacturing Operations & Management · Depth: Expert, quick

Summary

PF-CD3Q is a safe reinforcement learning approach to dynamic human-robot task planning and allocation (HRTPA) in manufacturing, a key component of Industry 5.0. It aims to maximize production efficiency while keeping workers' physical fatigue within safe limits. Unlike traditional HRTPA models, which use static fatigue parameters, PF-CD3Q accounts for day-to-day variation in a worker's fatigue sensitivity by estimating fatigue-related parameters online. A particle filter (PF) tracks the worker's fatigue in real time and updates the parameters of the fatigue model. These estimators are then incorporated into a constrained dueling double deep Q-learning (CD3Q) framework, which predicts task-level fatigue and excludes tasks that would exceed the fatigue limit. The overall problem is formulated as a constrained Markov decision process (CMDP).
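The online estimation step can be illustrated with a minimal particle filter sketch. The summary does not specify the fatigue model, so everything here is an assumption for illustration: a first-order accumulation model in which fatigue in [0, 1] rises toward 1 at rate `lam * load`, a noisy scalar fatigue measurement, and the hypothetical names `fatigue_step`, `pf_update`, `estimate`, and `run_demo`. Each particle carries a (fatigue, sensitivity) pair, so the filter tracks the fatigue state and the worker-specific parameter together.

```python
import math
import random


def fatigue_step(fatigue, lam, load, dt=1.0):
    """Assumed model (not from the paper): fatigue rises toward 1 at rate lam * load."""
    return 1.0 - (1.0 - fatigue) * math.exp(-lam * load * dt)


def pf_update(particles, load, measurement, rng, meas_std=0.05, lam_jitter=0.001):
    """One predict / weight / resample cycle over (fatigue, lam) particles."""
    predicted, weights = [], []
    for fatigue, lam in particles:
        # Random walk on the parameter lets the filter follow slow drift.
        lam = max(1e-4, lam + rng.gauss(0.0, lam_jitter))
        fatigue = fatigue_step(fatigue, lam, load)
        # Gaussian likelihood of the noisy fatigue measurement.
        err = measurement - fatigue
        weights.append(math.exp(-0.5 * (err / meas_std) ** 2) + 1e-300)
        predicted.append((fatigue, lam))
    # Multinomial resampling in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Posterior means of fatigue and of the sensitivity parameter."""
    n = len(particles)
    return sum(f for f, _ in particles) / n, sum(l for _, l in particles) / n


def run_demo(seed=0, true_lam=0.05, steps=20, n_particles=500):
    """Simulate a worker with a hidden sensitivity and recover it online."""
    rng = random.Random(seed)
    particles = [(0.0, rng.uniform(0.01, 0.15)) for _ in range(n_particles)]
    true_fatigue = 0.0
    for _ in range(steps):
        true_fatigue = fatigue_step(true_fatigue, true_lam, load=1.0)
        measurement = true_fatigue + rng.gauss(0.0, 0.02)
        particles = pf_update(particles, 1.0, measurement, rng)
    est_fatigue, est_lam = estimate(particles)
    return true_fatigue, est_fatigue, est_lam


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the sensitivity `lam` is never observed directly. It is inferred only because particles with a poor `lam` predict fatigue values that disagree with the measurements and are dropped at resampling.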

Key takeaway

For research scientists developing human-robot collaboration systems, PF-CD3Q shows one way to build dynamic human fatigue into task planning. Consider estimating fatigue parameters online rather than fixing them in advance, so that your HRTPA models reflect actual worker condition and keep task allocations within fatigue limits.

Key insights

PF-CD3Q uses safe reinforcement learning and online fatigue estimation for dynamic human-robot task allocation.

Principles

Ergonomics enhances worker well-being.
Fatigue parameters vary daily.
Online estimation improves model accuracy.

Method

PF-CD3Q integrates particle filter (PF) estimators, which track fatigue in real time and update the fatigue model's parameters, into a constrained dueling double deep Q-learning (CD3Q) framework. The framework predicts task-level fatigue and restricts the action space to tasks that stay within the fatigue limit.
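The action-space constraint can be sketched as a mask applied before the greedy step of a dueling double deep Q-learning agent. This is a simplified illustration, not the authors' implementation: the fatigue model, the per-task `task_loads`, the `fallback` action (for example, a rest or robot-only assignment), and all function names are assumptions, and plain lists stand in for network outputs.

```python
import math


def predict_fatigue(fatigue, lam, load, dt=1.0):
    """Assumed model (not from the paper): fatigue after doing a task of given load."""
    return 1.0 - (1.0 - fatigue) * math.exp(-lam * load * dt)


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def feasible_actions(fatigue, lam_est, task_loads, fatigue_limit):
    """Indices of tasks whose predicted fatigue stays within the limit."""
    return [
        i for i, load in enumerate(task_loads)
        if predict_fatigue(fatigue, lam_est, load) <= fatigue_limit
    ]


def select_action(q_values, feasible, fallback):
    """Greedy choice restricted to feasible tasks; hypothetical fallback if none."""
    if not feasible:
        return fallback
    return max(feasible, key=lambda a: q_values[a])


def double_dqn_target(reward, gamma, q_online_next, q_target_next, feasible_next, done):
    """Double DQN target: online net picks the action, target net evaluates it."""
    if done or not feasible_next:
        return reward
    best = max(feasible_next, key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


if __name__ == "__main__":
    feasible = feasible_actions(0.5, 0.1, [1.0, 3.0, 8.0], fatigue_limit=0.7)
    print(feasible, select_action([0.2, 0.5, 0.9], feasible, fallback=-1))
```

Masking both the behaviour policy and the bootstrap target keeps infeasible tasks from being executed and from inflating the value estimates of the states that precede them. In the demo call, the third task has the highest Q-value but is excluded because its predicted fatigue exceeds the limit.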

In practice

Apply online fatigue estimation.
Use constrained RL for safety.
Integrate PF with deep Q-learning.

Topics

Safe Reinforcement Learning
Human-Robot Collaboration
Task Planning and Allocation
Fatigue Prediction
Particle Filter

Best for: Research Scientist, Robotics Engineer, Machine Learning Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.