Enabling Predictive Maintenance in District Heating Substations: A Labelled Dataset and Fault Detection Evaluation Framework based on Service Data

2026-04-21 · Source: cs.AI updates on arXiv.org · Field: Energy & Utilities — Artificial Intelligence & Machine Learning, Utilities & Infrastructure, Energy Efficiency & Conservation · Depth: Advanced, extended

Summary

A new open-source framework and public dataset have been introduced to enable predictive maintenance in district heating substations (DHS). The framework combines the "PreDist" dataset, an evaluation method based on Accuracy, Reliability, and Earliness, and baseline results using the "EnergyFaultDetector" Python framework. The "PreDist" dataset contains 10-minute operational time series data from 93 DHS across two manufacturers, annotated with maintenance tasks, customer incident reports, and detailed fault metadata. The evaluation demonstrated that conditional autoencoder (AE) models within the "EnergyFaultDetector" achieved a normal-behavior accuracy of 0.98 and an eventwise F0.5 score of 0.83. These models detected 60% of faults before customer reports, with an average lead time of 3.9 days. The framework also supports root cause analysis using ARCANA, which was demonstrated through three use cases.
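To make the reported figures easier to read, here is a minimal sketch of how an F-beta score and an early-detection summary could be computed. The F-beta formula is the standard one. The helper names, the example timestamps, and the choice to average lead time only over faults detected before the customer report are assumptions for illustration. The paper's exact eventwise definitions are not reproduced here.

```python
from datetime import datetime


def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 weights precision more than recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def early_detection_summary(events):
    """Summarise earliness over (first_alarm, customer_report) timestamp pairs.

    Returns (share of events detected before the report,
             mean lead time in days over those early events).
    Averaging over early events only is an assumption of this sketch.
    """
    lead_days = [
        (report - alarm).total_seconds() / 86400.0
        for alarm, report in events
    ]
    early = [d for d in lead_days if d > 0]
    share = len(early) / len(lead_days) if lead_days else 0.0
    mean_lead = sum(early) / len(early) if early else 0.0
    return share, mean_lead


# Hypothetical events, not taken from the PreDist dataset.
example_events = [
    (datetime(2024, 1, 1), datetime(2024, 1, 4)),    # alarm 3 days early
    (datetime(2024, 1, 10), datetime(2024, 1, 11)),  # alarm 1 day early
    (datetime(2024, 1, 20), datetime(2024, 1, 19)),  # alarm after the report
]
share_early, mean_lead_days = early_detection_summary(example_events)
```

With beta set to 0.5, a false alarm costs more than a missed fault, which fits a setting where each alarm may trigger a service visit.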

Key takeaway

For machine learning engineers developing predictive maintenance solutions for district heating systems, this framework provides a public dataset and an evaluation methodology built around Accuracy, Reliability, and Earliness. Use the "PreDist" dataset and the "EnergyFaultDetector" baselines to benchmark new fault detection and diagnosis (FDD) algorithms. The reported results suggest that conditional autoencoder models are a strong starting point, and that pairing them with a root cause analysis tool such as ARCANA can make detections easier for operators to interpret.

Key insights

A new public dataset and evaluation framework enable reproducible early fault detection in district heating substations.

Principles

Public datasets accelerate FDD development.
Operational utility drives metric selection.
Conditional AEs improve fault detection.
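
As an illustration of what conditioning can look like, the sketch below builds a seasonal condition vector from a timestamp and appends it to the sensor readings. In a conditional autoencoder the condition is typically given to the model as extra context while only the sensor values are reconstructed. The cyclical day-of-year and time-of-day encoding, and the function names, are assumptions of this sketch. The summary does not state which condition features the paper uses.

```python
import math
from datetime import datetime


def seasonal_condition(ts: datetime) -> list[float]:
    """Encode day of year and time of day as points on a circle.

    Sine/cosine pairs keep 31 December close to 1 January and
    23:50 close to 00:00, which a raw integer encoding would not.
    """
    day_angle = 2 * math.pi * (ts.timetuple().tm_yday - 1) / 365.25
    hour_angle = 2 * math.pi * (ts.hour + ts.minute / 60) / 24
    return [
        math.sin(day_angle),
        math.cos(day_angle),
        math.sin(hour_angle),
        math.cos(hour_angle),
    ]


def conditional_input(sensors: list[float], ts: datetime) -> list[float]:
    """Concatenate sensor readings with the condition vector.

    The model would reconstruct only the first len(sensors) values.
    """
    return sensors + seasonal_condition(ts)


# Hypothetical reading: supply and return temperature in degrees Celsius.
row = conditional_input([70.0, 45.0], datetime(2024, 1, 1, 0, 0))
```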

Method

The method trains autoencoders on normal operational data only and flags anomalies where the reconstruction error exceeds a threshold. A criticality counter then aggregates the flagged time steps before an alarm is raised. ARCANA provides post-hoc root cause analysis by identifying the features that contribute most to an anomaly.
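
The detection steps after the autoencoder can be sketched in a few lines. The code below assumes reconstruction errors have already been computed by some model. It sets a threshold from errors on normal data, flags time steps above it, and runs a counter that rises on flagged steps and falls on normal ones. The quantile-based threshold, the increment and decrement rule, and all function names are assumptions of this sketch, not the documented behaviour of EnergyFaultDetector.

```python
def quantile_threshold(normal_errors, q=0.99):
    """Empirical q-quantile of reconstruction errors on normal data."""
    ordered = sorted(normal_errors)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


def criticality(flags):
    """Running counter: +1 on an anomalous step, -1 on a normal step, floor 0."""
    level = 0
    levels = []
    for is_anomalous in flags:
        level = level + 1 if is_anomalous else max(level - 1, 0)
        levels.append(level)
    return levels


def first_alarm_index(errors, threshold, max_criticality):
    """Index of the first step where the counter reaches max_criticality."""
    flags = [e > threshold for e in errors]
    for i, level in enumerate(criticality(flags)):
        if level >= max_criticality:
            return i
    return None


# Hypothetical error values, chosen only to exercise the logic.
threshold = quantile_threshold([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], q=0.5)
sustained = first_alarm_index([6, 6, 1, 6, 6, 6], threshold, max_criticality=3)
isolated = first_alarm_index([6, 1, 1, 6, 1], threshold, max_criticality=3)
```

A counter of this kind trades earliness for reliability: a higher `max_criticality` suppresses isolated spikes but delays the alarm for a genuine fault.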

In practice

Use conditional AEs for seasonal data.
Prioritize faults with high monitoring potential.
Apply ARCANA for anomaly root cause analysis.
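
The idea of ranking features by their contribution to an anomaly can be illustrated with a much simpler stand-in than ARCANA: each feature's share of the squared reconstruction error. This is not the ARCANA algorithm. The function name and the feature names are hypothetical.

```python
def rank_feature_contributions(observed, reconstructed):
    """Rank features by their share of the total squared reconstruction error.

    Both arguments map feature name -> value for a single time step.
    Returns a list of (feature, share) pairs, largest share first.
    """
    squared = {
        name: (observed[name] - reconstructed[name]) ** 2
        for name in observed
    }
    total = sum(squared.values())
    if total == 0:
        return []
    shares = [(name, err / total) for name, err in squared.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Hypothetical substation reading and its reconstruction.
ranking = rank_feature_contributions(
    observed={"supply_temp": 70.0, "return_temp": 45.0, "flow": 1.0},
    reconstructed={"supply_temp": 69.0, "return_temp": 40.0, "flow": 1.0},
)
```

In practice features would be scaled first, so that a variable with a large numeric range does not dominate the ranking.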

Topics

District Heating Substations
Predictive Maintenance
Fault Detection
Labelled Datasets
Autoencoders

Code references

AEFDI/EnergyFaultDetector

Best for: AI Engineer, AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.