A time-series classification framework for individual-level absenteeism prediction under severe class imbalance

arXiv CS.AI

CC BY

This source is published under a public or free license, so the full text is shown directly.

Abstract

Staff absenteeism imposes substantial operational costs in high-demand work environments such as healthcare, emergency services, meat processing, construction, and courier and delivery services, where proactive workforce planning depends on reliable individual-level absence prediction.

Existing regression and classification approaches share a structural limitation: they map features observed at time t to labels at the same time t, so they reproduce already-realised outcomes rather than predict future events. They also discard the sequential behavioural structure inherent in individual attendance histories.

We propose a Time Series Classification (TSC) framework that separates historical attendance sequences from future absence labels, enabling genuinely proactive prediction.
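
The separation of history from future label can be illustrated with a sliding window. This is a minimal sketch, not the paper's pipeline: the function name `make_windows`, the `horizon` parameter, the 1 = absent encoding, and the "any absence within the horizon" labelling rule are all assumptions made for illustration.

```python
from typing import List, Tuple


def make_windows(
    attendance: List[int], window: int, horizon: int
) -> List[Tuple[List[int], int]]:
    """Pair each history window with a label drawn only from later days.

    attendance: daily flags for one employee, 1 = absent, 0 = present
                (assumed encoding).
    window:     number of past days given to the classifier.
    horizon:    number of future days inspected for the label
                (hypothetical parameter, not taken from the paper).
    """
    samples = []
    for start in range(len(attendance) - window - horizon + 1):
        history = attendance[start:start + window]
        future = attendance[start + window:start + window + horizon]
        # The label never overlaps the history, so no realised outcome leaks in.
        label = 1 if any(future) else 0
        samples.append((history, label))
    return samples


days = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
samples = make_windows(days, window=4, horizon=2)
labels = [label for _, label in samples]
```

Because every label is computed from days strictly after its window, a model trained on these pairs has to forecast rather than re-describe the input.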

Due to the lack of public longitudinal attendance data, we construct a reproducible simulated dataset calibrated to the UCI dataset.

We analyse Binary Focal Loss (BFL) and Geometric Mean (G-Mean) loss under severe class imbalance using only the imbalance ratio $\rho$.
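
For orientation, the widely used alpha-balanced form of binary focal loss can be written per sample as below. This is a sketch under assumptions: the abstract does not state the focusing exponent, so `gamma = 2.0` is a placeholder, and the convention that `alpha` multiplies the label-1 term may differ from the one the authors use.

```python
import math


def binary_focal_loss(y: int, p: float, alpha: float, gamma: float = 2.0) -> float:
    """Alpha-balanced focal loss for a single sample.

    y:     true label, 0 or 1.
    p:     predicted probability of class 1.
    alpha: weight on the y = 1 term; (1 - alpha) weights the y = 0 term
           (assumed convention).
    gamma: focusing exponent; 2.0 is a placeholder, not a value from the paper.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss on samples already classified well.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(1, 0.9, alpha=0.5)  # confident and correct
hard = binary_focal_loss(1, 0.1, alpha=0.5)  # confident and wrong
```

With `gamma = 0` the expression reduces to class-weighted binary cross-entropy, which makes the separate roles of `alpha` (class weighting) and `gamma` (hard-example focusing) easy to see.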

For BFL, the initial gradient ratio is $\rho\alpha/(1-\alpha)$, implying the balanced weight $\alpha = 1/(1+\rho) \approx 0.023$.
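
The balanced weight follows from setting the stated ratio to one. The second line inverts the result to show the imbalance ratio implied by the rounded value 0.023; that figure is back-computed here and is not quoted from the paper.

$$
\frac{\rho\alpha}{1-\alpha} = 1
\;\Longrightarrow\; \rho\alpha = 1-\alpha
\;\Longrightarrow\; \alpha(1+\rho) = 1
\;\Longrightarrow\; \alpha = \frac{1}{1+\rho}
$$

$$
\alpha \approx 0.023
\;\Longrightarrow\; \rho = \frac{1}{\alpha} - 1 \approx 42.5
$$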

Experiments show that performance is governed mainly by $\alpha$, with BFL achieving specificity 0.813 and balanced accuracy 0.888, comparable to G-Mean.
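
The reported metrics can be computed from confusion-matrix counts using their standard definitions, as in the sketch below. The counts are invented for illustration and are not results from the paper.

```python
import math


def imbalance_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard imbalance-aware metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "g_mean": math.sqrt(sensitivity * specificity),
    }


# Illustrative counts only: 20 positives and 1000 negatives.
m = imbalance_metrics(tp=18, fn=2, tn=800, fp=200)
```

Both balanced accuracy and G-Mean average over the two per-class rates, so neither is dominated by the majority class the way plain accuracy is.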

Unlike BFL, G-Mean adapts automatically without parameter calibration.
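
One way to see why a G-Mean objective needs no class weight is a soft, batch-level version in which each class rate is normalised by its own class count. The formulation below is an assumption chosen for illustration; the abstract does not give the paper's exact loss, and the name `soft_g_mean_loss` is hypothetical.

```python
import math
from typing import Sequence


def soft_g_mean_loss(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Return 1 - sqrt(soft TPR * soft TNR) over one batch.

    y_true: labels, 0 or 1.
    p_pred: predicted probabilities of class 1.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("batch must contain both classes")
    # Each rate is averaged within its own class, so class sizes cancel out.
    soft_tpr = sum(p for y, p in zip(y_true, p_pred) if y == 1) / pos
    soft_tnr = sum(1.0 - p for y, p in zip(y_true, p_pred) if y == 0) / neg
    return 1.0 - math.sqrt(soft_tpr * soft_tnr)


y = [1, 0, 0, 0, 0]
always_negative = soft_g_mean_loss(y, [0.0] * 5)
perfect = soft_g_mean_loss(y, [1.0, 0.0, 0.0, 0.0, 0.0])
```

In this sketch a classifier that always predicts the majority class receives the maximum loss of 1 however rare the minority class is, so no tunable weight plays the role that $\alpha$ plays in BFL.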

Among the three deep learning architectures evaluated (Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and the hybrid LSTM-Fully Convolutional Network (LSTM-FCN)), the LSTM-FCN delivers strong precision and specificity.

Stable performance is obtained with batch sizes of at least 64 and window sizes of 40 to 80 days, yielding a balanced accuracy of approximately 80% on held-out test data.
