Kaggle Solution Walkthroughs: LEAP - Atmospheric Physics using AI (ClimSim) with TeamZ Lab 数据实验室 (Data Lab)

2026-02-25 · Source: Kaggle · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Software Development & Engineering · Depth: Intermediate, medium

Summary

A data science competition team from China, holder of multiple Kaggle medals, details its winning solution for a multi-label regression task. The team processed the data extensively, training on 75 million samples from the low-resolution dataset (years 1-8) and validating on 0.8 million samples. Central to the result was a staged training method: train on all 368 labels first, then fine-tune the model seven times, once for each of seven distinct label groups, which improved the score by at least 0.001. Training ran for 9 epochs under a cosine scheduler with restarts (two periods, of 3 and 6 epochs) plus early stopping, and optimized a Smooth L1 loss together with an auxiliary difference loss. The models were primarily LSTM-based, which outperformed CNNs and Transformers here, and used residual connections for faster convergence and better performance. The final ensemble used hill climbing, with negative weights allowed, to combine diverse LSTM-based models, some with convolutional encoders or MemNN blocks.
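The hill-climbing ensemble mentioned in the summary can be illustrated with a small greedy weight search. This is a minimal sketch under stated assumptions, not the team's code: the function names, the step size, the stopping rule, and the use of mean squared error as a stand-in for the validation metric are all hypothetical. The point it shows is that letting a weight go below zero can cancel an error that two models share.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def blend(preds, weights):
    """Weighted sum of per-model predictions."""
    n = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(n)]


def hill_climb(preds, target, step=0.05, max_iters=200):
    """Greedy coordinate search over blend weights (assumed procedure).

    Starts from the best single model, then nudges one weight at a time
    up or down and keeps any nudge that lowers the validation error.
    Weights are unconstrained, so they may become negative.
    """
    scores = [mse(p, target) for p in preds]
    best_idx = min(range(len(preds)), key=scores.__getitem__)
    weights = [0.0] * len(preds)
    weights[best_idx] = 1.0
    best = scores[best_idx]
    for _ in range(max_iters):
        improved = False
        for i in range(len(preds)):
            for delta in (step, -step):  # the -step branch allows negative weights
                trial = weights[:]
                trial[i] += delta
                score = mse(blend(preds, trial), target)
                if score < best - 1e-12:
                    best, weights, improved = score, trial, True
        if not improved:
            break
    return weights, best


# Toy validation set: both models share the same error pattern e,
# model_a = target + e and model_b = target + 2e.
target = [1.0, -1.0, 1.0, -1.0]
model_a = [2.0, 0.0, 0.0, -2.0]
model_b = [3.0, 1.0, -1.0, -3.0]

weights, score = hill_climb([model_a, model_b], target)
print(weights, score)
```

In this toy setup the search ends with a negative weight on `model_b` and a lower error than either model alone, because subtracting part of the worse model removes some of the shared error. On real data the weights should be fitted on held-out predictions to avoid overfitting the blend.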

Key takeaway

For data scientists and ML engineers tackling multi-label regression problems, consider adopting a staged training approach. First, train one model on all labels, then fine-tune it separately for each group of related labels. In this competition the fine-tuning step improved the score by at least 0.001. The team paired it with LSTM architectures and a cosine learning rate scheduler with restarts, which can help training escape local minima.
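The cosine schedule with restarts can be written as a short function of the training position. The sketch below uses the two periods named in the summary (3 and 6 epochs, 9 in total); the maximum and minimum learning rates and the function name are assumptions chosen only for illustration.

```python
import math


def cosine_restart_lr(epoch, periods=(3, 6), lr_max=1e-3, lr_min=1e-5):
    """Learning rate at a (possibly fractional) epoch position.

    Within each period the rate follows half a cosine wave from lr_max
    down to lr_min, then jumps back to lr_max when the next period starts.
    lr_max and lr_min are illustrative values, not the team's settings.
    """
    start = 0.0
    for length in periods:
        if epoch < start + length:
            progress = (epoch - start) / length
            return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))
        start += length
    return lr_min  # past the last period


# Print the rate at the start of each of the 9 epochs.
for e in range(9):
    print(e, cosine_restart_lr(e))
```

The rate is back at `lr_max` at epoch 3, which is the restart. In PyTorch, `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts` with `T_0=3` and `T_mult=2` produces the same period lengths; whether the team used that class is not stated in the source.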

Key insights

Group-wise fine-tuning and LSTM-based architectures both improved multi-label regression performance in this competition.

Principles

Fine-tuning by label groups improves multi-label regression.
LSTMs can outperform CNNs/Transformers in specific tasks.
Residual connections accelerate model convergence.

Method

Train a multi-label model on all targets, then fine-tune it iteratively for specific label groups, adjusting the loss to focus on one group at a time.
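The method above can be sketched as a training schedule plus a loss that only counts the active label group. This is a minimal illustration with hypothetical names and a toy group layout; squared error stands in for the team's loss, and the final assembly step (taking each group's columns from the model fine-tuned on that group) is an assumption about how the fine-tuned models are used, not something the source spells out.

```python
def masked_mse(pred, target, active):
    """Mean squared error over only the label indices in `active`."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for j in active:
            total += (p_row[j] - t_row[j]) ** 2
            count += 1
    return total / count


def staged_schedule(groups):
    """Yield (stage_name, active_indices): all labels first, then each group."""
    yield "all", sorted(j for idx in groups.values() for j in idx)
    for name, idx in groups.items():
        yield name, list(idx)


def assemble(group_preds, groups, n_labels):
    """Build final predictions by taking each group's columns from the
    model that was fine-tuned on that group (assumed usage)."""
    n_rows = len(next(iter(group_preds.values())))
    out = [[0.0] * n_labels for _ in range(n_rows)]
    for name, idx in groups.items():
        for r in range(n_rows):
            for j in idx:
                out[r][j] = group_preds[name][r][j]
    return out


# Hypothetical layout: 3 labels split into two groups.
groups = {"g0": range(0, 2), "g1": range(2, 3)}

for stage, active in staged_schedule(groups):
    # In a real pipeline: load the stage-1 checkpoint (for group stages),
    # then train with masked_mse restricted to `active`.
    print(stage, active)
```

Each fine-tuning stage starts from the same all-label checkpoint, so the seven runs are independent and can be trained in parallel.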

In practice

Use Smooth L1 loss for stable multi-label regression.
Implement an auxiliary difference loss for related labels.
Employ a cosine scheduler with restarts for training.
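The two loss terms in the list above can be sketched in a few lines. Smooth L1 follows its standard definition (quadratic below `beta`, linear above). The source does not define the auxiliary difference loss, so the version here is an assumption: it compares the differences between neighbouring labels in the prediction against the same differences in the target. The `aux_weight` value and all function names are hypothetical.

```python
def smooth_l1(x, beta=1.0):
    """Smooth L1 (Huber-style) penalty for a single error value."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta


def diffs(seq):
    """Differences between neighbouring entries of a sequence."""
    return [b - a for a, b in zip(seq, seq[1:])]


def combined_loss(pred, target, aux_weight=0.5, beta=1.0):
    """Smooth L1 on the labels plus Smooth L1 on neighbour differences.

    The difference term is an assumed reading of "auxiliary difference
    loss"; it needs at least two labels per sample.
    """
    main = sum(smooth_l1(p - t, beta) for p, t in zip(pred, target)) / len(pred)
    dp, dt = diffs(pred), diffs(target)
    aux = sum(smooth_l1(a - b, beta) for a, b in zip(dp, dt)) / len(dp)
    return main + aux_weight * aux


# A constant offset leaves neighbour differences unchanged, so only the
# main term is non-zero here.
print(combined_loss([1.5, 2.5], [1.0, 2.0]))
# A wrong shape is penalised by both terms.
print(combined_loss([0.0, 0.0], [0.0, 2.0]))
```

Under this reading the auxiliary term rewards predictions whose profile across related labels has the right shape, even when the absolute level is slightly off.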

Topics

Group Fine-tuning
LSTM Architectures
Ensemble Learning
Loss Functions
Kaggle Competition

Best for: Machine Learning Engineer, Data Scientist, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Kaggle.