Precision Physical Activity Prescription via Reinforcement Learning for Functional Actions

2026-06-24 · Source: stat.ML updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Digital Health · Depth: Expert, extended

Summary

This paper introduces an offline reinforcement learning (RL) algorithm that generates precision physical activity (PA) prescriptions. To address the limits of scalar PA summaries, the method learns personalized, optimal distributions of daily steps over 90-day periods, with the goal of improving cardiometabolic health biomarkers. Using longitudinal Fitbit step counts and health data from 205 participants in the All of Us Research Program, the algorithm extends fitted Q-iteration to functional actions and uses penalized splines to keep the policy smooth. In simulation studies it outperforms continuous-action RL. The resulting policy generally recommends increasing daily steps to around 10,000 and maintaining a more consistent PA pattern, with tailored adjustments for subgroups defined by blood glucose, BMI, blood pressure, age, and sex.

Key takeaway

For data scientists developing personalized health interventions, this research suggests that replacing scalar physical activity summaries with functional distributions, optimized via offline reinforcement learning, can yield more precise prescriptions. Consider modeling daily step patterns as functions rather than averages, so that recommendations can be tailored to individual cardiometabolic profiles, age, and sex. This approach supports nuanced guidance, such as increasing moderate-to-high activity steps for specific subgroups, which may improve adherence and health outcomes.

Key insights

Offline reinforcement learning can optimize personalized physical activity prescriptions represented as functional distributions of daily steps.
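
To make "functional distributions of daily steps" concrete, here is a minimal sketch that summarizes a 90-day window by its empirical quantile function instead of its mean. The two windows, the probability grid, and the helper name `step_quantile_function` are hypothetical illustrations, not data or code from the paper.

```python
import numpy as np

def step_quantile_function(daily_steps, probs):
    """Empirical quantile function of daily step counts on a probability grid."""
    return np.quantile(np.asarray(daily_steps, dtype=float), probs)

probs = np.linspace(0.1, 0.9, 9)

# Two hypothetical 90-day windows with the same mean (8,000 steps/day).
steady = np.full(90, 8000.0)
bursty = np.tile([2000.0, 8000.0, 14000.0], 30)

q_steady = step_quantile_function(steady, probs)
q_bursty = step_quantile_function(bursty, probs)

print(steady.mean(), bursty.mean())  # identical scalar summaries
print(q_steady)                      # flat quantile function
print(q_bursty)                      # spread-out quantile function
```

The scalar mean cannot tell these two windows apart, while the quantile functions differ at most grid points. That is the information a distribution-valued action can carry and a scalar action cannot.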

Principles

Represent physical activity as distributions, not scalar summaries.
Functional actions in RL model complex, continuous behavioral patterns.
Derive optimal policies from observational health data using offline RL.
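
The third principle can be sketched with generic fitted Q-iteration on logged transitions. This is an illustrative stand-in, not the paper's algorithm: the data are synthetic, the functional action is reduced to a short vector of assumed basis coefficients, the Q-function is a ridge regression on hand-picked quadratic features, and the maximization is restricted to actions that appear in the log. All names (`features`, `ridge_fit`, `greedy`, `base`, `slope`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K, gamma = 500, 3, 0.9

# Synthetic logged transitions (all hypothetical).
S = rng.normal(size=(n, 1))             # state, e.g. a standardized biomarker
A = rng.normal(size=(n, K))             # basis coefficients of a functional action
base = np.array([1.0, 0.0, -0.5])       # made-up state-independent part of the best action
slope = np.array([0.5, -0.5, 0.0])      # made-up dependence of the best action on the state
R = -np.sum((A - (base + S * slope)) ** 2, axis=1)
S_next = 0.8 * S + 0.1 * A[:, :1] + 0.1 * rng.normal(size=(n, 1))

def features(s, a):
    """Intercept, linear, squared, and state-by-action interaction terms."""
    x = np.hstack([s, a])
    return np.hstack([np.ones((len(x), 1)), x, x ** 2, s * a])

def ridge_fit(X, y, lam=1e-2):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

candidates = A[:50]  # restrict the max to actions seen in the log

def greedy(w, s):
    """For each state, return the best candidate action and its Q-value."""
    q = np.stack(
        [features(s, np.tile(a, (len(s), 1))) @ w for a in candidates], axis=1
    )
    return candidates[q.argmax(axis=1)], q.max(axis=1)

X = features(S, A)
w = np.zeros(X.shape[1])
for _ in range(30):
    # Regress the Bellman target r + gamma * max_a' Q(s', a') onto the features.
    _, v_next = greedy(w, S_next)
    w = ridge_fit(X, R + gamma * v_next)

policy_actions, policy_values = greedy(w, S[:5])
print(policy_actions)
```

Restricting the maximization to logged actions is one simple way to keep the learned policy inside the support of the observational data; the paper's treatment of the functional action space may differ.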

Method

Extends Fitted Q-Evaluation and Fitted Q-Iteration to functional actions, using penalized splines for policy smoothness. Actions are represented via the log-quantile-density transformation.
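
The log-quantile-density representation can be sketched as follows, taking the standard definition (the log of the derivative of the quantile function). The discretization here, forward differences on a fixed probability grid, is an assumption for illustration; the summary does not say how the paper discretizes the transform, and the step data are synthetic.

```python
import numpy as np

def lqd_transform(quantiles, probs):
    """Log quantile density on each grid interval: log(dQ/dp) via forward differences."""
    slope = np.diff(quantiles) / np.diff(probs)
    return np.log(np.maximum(slope, 1e-8))  # floor guards against flat segments

def lqd_inverse(psi, probs, q_start):
    """Rebuild the quantile function by integrating exp(psi) from the first grid point."""
    increments = np.exp(psi) * np.diff(probs)
    return q_start + np.concatenate(([0.0], np.cumsum(increments)))

rng = np.random.default_rng(1)
steps = rng.gamma(shape=6.0, scale=1300.0, size=90)  # synthetic 90-day window
probs = np.linspace(0.05, 0.95, 19)
Q = np.quantile(steps, probs)

psi = lqd_transform(Q, probs)                    # unconstrained real-valued curve
Q_back = lqd_inverse(psi, probs, Q[0])           # round trip recovers Q on the grid
Q_shifted = lqd_inverse(psi + 0.2, probs, Q[0])  # a perturbed psi still gives a monotone Q
```

The appeal of this representation is that any real-valued curve maps back to a valid, increasing quantile function, so a regression or smoothing step can operate on `psi` without monotonicity constraints.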

In practice

Aim for about 10,000 daily steps with a consistent pattern.
Individuals with normal glucose levels should increase moderate-to-high activity steps.
Individuals with obesity (BMI ≥ 30.0) should increase steps across all activity periods.

Topics

Reinforcement Learning
Physical Activity Prescription
Functional Policy Learning
All of Us Research Program
Cardiometabolic Risk
Wearable Device Data

Code references

gefeilin/Functional-Fitted-Q-Learning

Best for: AI Scientist, Research Scientist, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.