A time-series classification framework for individual-level absenteeism prediction under severe class imbalance

2026-06-30 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

A new Time Series Classification (TSC) framework targets individual-level absenteeism prediction, a problem that carries significant operational costs in sectors such as healthcare. Existing methods are limited because they map features to labels from the same time step and discard sequential attendance history. The framework instead predicts future absences by separating historical attendance sequences from future absence labels. Because no public longitudinal dataset exists, the researchers constructed a reproducible simulated dataset calibrated to the UCI dataset. They evaluated Binary Focal Loss (BFL) and Geometric Mean (G-Mean) loss under severe class imbalance: BFL achieved a specificity of 0.813 and a balanced accuracy of 0.888, comparable to G-Mean, which adapts automatically without parameter calibration. Among the deep learning architectures tested, the hybrid LSTM-Fully Convolutional Network (LSTM-FCN) delivered strong precision and specificity. Performance was stable, at approximately 80% balanced accuracy on held-out test data, with batch sizes of 64 or more and window sizes of 40-80 days.
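The metrics quoted in the summary can be made concrete with a short sketch. It assumes the standard definitions (balanced accuracy as the mean of sensitivity and specificity, G-Mean as their geometric mean); the confusion-matrix counts are hypothetical and are not taken from the source.

```python
import math

def imbalance_metrics(tp, fn, tn, fp):
    """Compute imbalance-aware metrics from confusion-matrix counts.

    Assumes at least one positive and one negative example.
    """
    sensitivity = tp / (tp + fn)          # recall on the absent class
    specificity = tn / (tn + fp)          # recall on the present class
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "g_mean": math.sqrt(sensitivity * specificity),
    }

# Hypothetical counts: 1,000 samples, 50 of them absences (5% positives).
example = imbalance_metrics(tp=40, fn=10, tn=855, fp=95)

# A classifier that always predicts "present" looks accurate but is useless.
always_present = imbalance_metrics(tp=0, fn=50, tn=950, fp=0)

# If the reported BFL figures come from the same evaluation and use the
# standard definition, they would imply this sensitivity (an inference,
# not a number stated in the source).
implied_sensitivity = round(2 * 0.888 - 0.813, 3)

print(example)
print(always_present)
print(implied_sensitivity)
```

The always-present baseline shows why plain accuracy misleads under severe imbalance: it scores high on accuracy while its balanced accuracy sits at chance and its G-Mean collapses to zero.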

Key takeaway

For workforce planners and ML engineers working on absenteeism prediction in high-demand environments, this framework suggests a concrete setup. Consider a TSC model, specifically the LSTM-FCN architecture, that uses historical attendance sequences to forecast absences before they occur. G-Mean loss can simplify handling severe class imbalance because it adapts automatically, without parameter calibration. Batch sizes of 64 or more and window sizes of 40-80 days gave stable results in the study, at approximately 80% balanced accuracy on held-out test data.

Key insights

A Time Series Classification framework proactively predicts individual absenteeism from historical attendance sequences, which traditional same-time feature-to-label methods discard.

Principles

Separate historical sequences from future labels for proactive prediction.
G-Mean loss adapts automatically to severe class imbalance.
LSTM-FCN delivers strong precision and specificity for TSC.
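
The first principle, keeping the input history strictly earlier than the label, can be sketched as a sliding-window split. This is a minimal illustration, not the authors' pipeline: the function name make_windows, the 0/1 encoding, and the 7-day prediction horizon are all assumptions; only the 40-80 day window range comes from the summary.

```python
def make_windows(attendance, window=60, horizon=7):
    """Split one employee's daily record into (history, label) pairs.

    attendance: list of 0/1 flags, one per day (1 = absent). Hypothetical encoding.
    window:     number of past days fed to the model (summary suggests 40-80).
    horizon:    number of future days the label covers (assumed, not from the source).
    The label is 1 if any absence occurs in the horizon that follows the window.
    """
    samples = []
    last_start = len(attendance) - window - horizon
    for start in range(last_start + 1):
        history = attendance[start:start + window]
        future = attendance[start + window:start + window + horizon]
        samples.append((history, int(any(future))))
    return samples

# Hypothetical 100-day record with absences on days 70 and 71 (0-indexed).
record = [0] * 100
record[70] = 1
record[71] = 1

samples = make_windows(record, window=60, horizon=7)
positives = sum(label for _, label in samples)
print(len(samples), positives)
```

Because each history slice ends before its future slice begins, no label information leaks into the input, which is what makes the prediction proactive rather than a same-time mapping.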

Method

The researchers construct a simulated dataset, compare BFL and G-Mean loss under severe class imbalance, and evaluate deep learning architectures (LSTM, CNN, LSTM-FCN).
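
The two losses compared here can be sketched in plain Python. The focal loss follows the widely used form with a focusing parameter gamma and a class weight alpha. The G-Mean loss shown is one plausible differentiable surrogate (one minus the geometric mean of soft true-positive and true-negative rates); the summary does not give the paper's exact formulation, so treat it as an assumption. The default values of gamma and alpha and the toy data are also illustrative.

```python
import math

def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)              # avoid log(0)
        p_t = p if y == 1 else 1 - p               # probability of the true class
        alpha_t = alpha if y == 1 else 1 - alpha   # class weight
        total += -alpha_t * (1 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)

def soft_g_mean_loss(y_true, p_pred, eps=1e-7):
    """Assumed surrogate: 1 - sqrt(soft TPR * soft TNR). No tunable parameters."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    soft_tpr = sum(p for y, p in zip(y_true, p_pred) if y == 1) / (pos + eps)
    soft_tnr = sum(1 - p for y, p in zip(y_true, p_pred) if y == 0) / (neg + eps)
    return 1 - math.sqrt(soft_tpr * soft_tnr)

# Toy batch: 19 "present" labels and 1 "absent" label (5% positives).
y = [0] * 19 + [1]
lazy = [0.05] * 20            # predicts "present" for everyone
good = [0.05] * 19 + [0.9]    # also recognises the single absence

focal_lazy, focal_good = binary_focal_loss(y, lazy), binary_focal_loss(y, good)
gmean_lazy, gmean_good = soft_g_mean_loss(y, lazy), soft_g_mean_loss(y, good)
print(focal_lazy, focal_good)
print(gmean_lazy, gmean_good)
```

The focal loss exposes gamma and alpha as values that must be calibrated, whereas this G-Mean surrogate has none, which is consistent with the summary's remark that G-Mean adapts without parameter calibration.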

In practice

Use LSTM-FCN for robust time-series classification tasks.
Consider G-Mean loss for imbalanced datasets; it needs no parameter calibration.
Use batch sizes of 64 or more and window sizes of 40-80 days.

Topics

Time Series Classification
Absenteeism Prediction
Class Imbalance
LSTM-FCN
Binary Focal Loss
Geometric Mean Loss

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.