Kaggle Solution Walkthroughs: Enefit - Predict Energy Behavior of Prosumers with Team 预测多了一点

· Source: Kaggle · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, medium

Summary

A second-place team in the Enefit forecasting competition details its solution, which trained three distinct models: a neural network (NN), LightGBM, and CatBoost. The team found that replacing the original target with its ratio to historical targets improved the score by 1-2 points. Key features included the ratio between direct and surface solar radiation, historical electricity values, and weather conditions. The team used online training, and the full training run for a submission took approximately six hours on Kaggle kernels. For feature selection, they built validation datasets and kept the top 50 features, many of which were ratios. The NN performed best on its own, which led to a weighted ensemble of 0.5 for the NN and 0.25 each for LightGBM and CatBoost.
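The final blend described above can be sketched in a few lines. This is a minimal illustration, not the team's code: only the weights (0.5, 0.25, 0.25) come from the summary, while the function name `blend_predictions` and the dictionary keys are assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Weights reported in the summary: 0.5 for the NN, 0.25 each for the boosters.
# The key names are hypothetical labels chosen for this sketch.
ENSEMBLE_WEIGHTS: Dict[str, float] = {"nn": 0.5, "lightgbm": 0.25, "catboost": 0.25}


def blend_predictions(
    predictions: Dict[str, Sequence[float]],
    weights: Dict[str, float] = ENSEMBLE_WEIGHTS,
) -> List[float]:
    """Return the weighted average of per-model predictions, row by row."""
    if set(predictions) != set(weights):
        raise ValueError("predictions and weights must cover the same models")
    lengths = {len(p) for p in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all models must predict the same number of rows")
    n_rows = lengths.pop()
    total = sum(weights.values())  # normalise in case weights do not sum to 1
    return [
        sum(weights[name] * predictions[name][i] for name in weights) / total
        for i in range(n_rows)
    ]


if __name__ == "__main__":
    demo = {
        "nn": [10.0, 20.0],
        "lightgbm": [12.0, 16.0],
        "catboost": [8.0, 24.0],
    }
    print(blend_predictions(demo))
```

Dividing by the weight total keeps the result a proper average even if the weights are later retuned and no longer sum to one.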

Key takeaway

For data scientists and ML engineers building forecasting models, consider transforming your target variable into a ratio with historical data, as this approach can yield significant performance gains. Prioritize online training for dynamic datasets to ensure models are always leveraging the latest information. Additionally, focus on creating ratio-based features from historical and environmental data, as these often prove highly impactful for predictive accuracy.
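One way to picture the ratio target is shown below. The summary does not state which historical value the team divided by or how they handled zeros, so the choice of a single lagged target, the smoothing constant `EPS`, and both function names are assumptions for illustration only.

```python
from typing import List, Sequence

# Hypothetical smoothing constant: avoids division by zero when the
# historical target is 0 (for example, solar production at night).
EPS = 1e-3


def to_ratio_target(target: Sequence[float], lagged: Sequence[float]) -> List[float]:
    """Express each target as a ratio to its historical (lagged) value."""
    return [(t + EPS) / (h + EPS) for t, h in zip(target, lagged)]


def from_ratio_prediction(ratio: Sequence[float], lagged: Sequence[float]) -> List[float]:
    """Invert the transform so predictions are back in original units."""
    return [r * (h + EPS) - EPS for r, h in zip(ratio, lagged)]


if __name__ == "__main__":
    target = [120.0, 0.0, 35.5]   # values to predict
    lagged = [100.0, 0.0, 40.0]   # same series at an earlier time (assumed lag)
    ratios = to_ratio_target(target, lagged)
    restored = from_ratio_prediction(ratios, lagged)
    print(ratios)
    print(restored)
```

A model trained on the ratio only has to learn how the value changes relative to recent history, and the inverse step maps its output back to the original scale before scoring.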

Key insights

Target variable transformation and online training significantly enhance forecasting model performance.

Method

Train separate models for consumption and production using online training. Transform target variables into ratios with historical data. Employ cross-validation (4-fold for NN, 6-fold for boosting models) and ensemble with weighted averaging based on public leaderboard scores.
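The data handling implied by this method can be sketched as follows. The `is_consumption` flag is the column name used in the competition data; everything else is an assumption, including the helper names and the use of contiguous, unshuffled folds, since the summary does not say how the team constructed its folds.

```python
from typing import Dict, List, Tuple


def split_by_target_type(rows: List[dict]) -> Dict[str, List[dict]]:
    """Route rows to separate consumption and production training sets."""
    groups: Dict[str, List[dict]] = {"consumption": [], "production": []}
    for row in rows:
        key = "consumption" if row["is_consumption"] else "production"
        groups[key].append(row)
    return groups


def kfold_indices(n_samples: int, n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Build contiguous folds and return (train_idx, valid_idx) pairs.

    Contiguous blocks are an assumption of this sketch, not a detail
    taken from the source.
    """
    base, extra = divmod(n_samples, n_folds)
    folds = []
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, valid))
        start += size
    return folds


if __name__ == "__main__":
    rows = [{"is_consumption": i % 2, "target": float(i)} for i in range(10)]
    groups = split_by_target_type(rows)
    # Fold counts from the summary: 4 for the NN, 6 for the boosting models.
    nn_folds = kfold_indices(len(groups["consumption"]), 4)
    gbm_folds = kfold_indices(len(groups["consumption"]), 6)
    print(len(nn_folds), len(gbm_folds))
```

Each target type then gets its own set of fold models, so a consumption model never trains on production rows and vice versa.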

Best for: Machine Learning Engineer, Data Scientist, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Kaggle.