Learning Earthquake Wave Arrival Time Picking from Labels with Inaccuracies
Summary
Label Noise-Contrastive Robust Learning (LaNCoR) is a new approach for mitigating the impact of inaccurately labeled training data, or "label noise," on supervised machine learning models. It is particularly relevant for seismic signal processing, where the usual remedies, such as larger training sets or data augmentation, are labor-intensive and expensive. LaNCoR aligns the distributions of input waveform features and label representations within the feature space, which corrects mislabeling and reduces its detrimental effect during training. Evaluated on P-phase arrival-time picking with real microseismic data and two baseline models, LaNCoR improved performance by up to 28.8% across the reported metrics. The approach offers a promising alternative for robust model training in seismology and the geosciences.
Key takeaway
Machine learning engineers building models for seismological applications with noisy labels should consider integrating LaNCoR. It is a robust alternative to costly large-scale datasets or data augmentation, aimed at improving model generalization and accuracy. In the reported experiments it improved P-phase arrival-time picking by up to 28.8% without extensive data curation. Evaluate it on your own microseismic datasets to confirm the benefit.
Key insights
LaNCoR effectively mitigates label noise in seismic data by aligning feature and label representation distributions.
Principles
- Inaccurate labels significantly degrade supervised ML model performance.
- Traditional label noise reduction methods are labor-intensive and costly.
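To see how much inaccurate labels matter for a given picker, one option is to inject controlled noise into clean arrival-time labels and retrain. The sketch below is an illustration only, not part of the original article: the function names `perturb_picks` and `mean_abs_error`, the uniform shift model, and the example pick indices are all assumptions.

```python
import random


def perturb_picks(true_picks, noise_fraction, max_shift, n_samples, seed=0):
    """Shift a random fraction of pick indices by up to +/- max_shift samples.

    Hypothetical noise model: real labeling errors may be biased or
    heavy-tailed rather than uniform.
    """
    rng = random.Random(seed)
    noisy = []
    for pick in true_picks:
        if rng.random() < noise_fraction:
            shift = rng.randint(-max_shift, max_shift)
            # Clamp so the shifted pick stays inside the waveform window.
            pick = min(max(pick + shift, 0), n_samples - 1)
        noisy.append(pick)
    return noisy


def mean_abs_error(picks_a, picks_b):
    """Mean absolute difference between two pick lists, in samples."""
    return sum(abs(a - b) for a, b in zip(picks_a, picks_b)) / len(picks_a)


# Made-up pick indices inside a 512-sample window.
true_picks = [120, 340, 255, 410, 90, 300]
noisy_picks = perturb_picks(true_picks, noise_fraction=0.5, max_shift=20, n_samples=512)
print("label error (samples):", mean_abs_error(true_picks, noisy_picks))
```

Training the same model on `true_picks` and on `noisy_picks` and comparing the results gives a rough measure of how sensitive it is to label noise.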
Method
The Label Noise-Contrastive Robust Learning (LaNCoR) approach aligns input waveform feature and label representation distributions in feature space to correct mislabeling during training.
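The summary does not give LaNCoR's actual objective, so the following is only a rough sketch of what "aligning" waveform features with label representations could look like. It uses a generic InfoNCE-style contrastive loss as a stand-in. The function `alignment_loss`, the temperature value, and the toy embeddings are assumptions for illustration, not the authors' formulation.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def alignment_loss(wave_embs, label_embs, temperature=0.1):
    """Generic InfoNCE-style loss (a stand-in, not the LaNCoR objective).

    Row i of wave_embs is pulled toward row i of label_embs and pushed
    away from the label embeddings of the other samples in the batch.
    """
    waves = [l2_normalize(v) for v in wave_embs]
    labels = [l2_normalize(v) for v in label_embs]
    total = 0.0
    for i, wave in enumerate(waves):
        logits = [dot(wave, label) / temperature for label in labels]
        # Numerically stable log-sum-exp over the batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(z - top) for z in logits))
        total += log_denominator - logits[i]
    return total / len(waves)


# Toy 2-D embeddings: matched pairs versus deliberately swapped pairs.
wave_embs = [[1.0, 0.0], [0.0, 1.0]]
matched_loss = alignment_loss(wave_embs, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = alignment_loss(wave_embs, [[0.0, 1.0], [1.0, 0.0]])
print(matched_loss, swapped_loss)
```

The loss is small when each waveform embedding sits closest to its own label embedding and large when pairs are mismatched. That is the property a training signal needs if it is to discourage fitting labels that disagree with the waveform.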
In practice
- Apply LaNCoR to P-phase arrival-time picking tasks.
- Treat the reported improvement of up to 28.8% (P-phase picking on real microseismic data, two baseline models) as an upper bound to verify on your own data.
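A figure such as "up to 28.8%" is a relative improvement over a baseline, and the same calculation can be run on your own picks. The sketch below shows one way to do that. The metric choices (mean absolute error and hit rate within a tolerance), the function names, and every number are illustrative assumptions and are not taken from the article.

```python
def mean_abs_error(predicted, reference):
    """Mean absolute pick residual, in samples."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def hit_rate(predicted, reference, tolerance):
    """Fraction of picks within `tolerance` samples of the reference pick."""
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance)
    return hits / len(reference)


def relative_improvement(baseline, candidate, lower_is_better=True):
    """Percentage improvement of `candidate` over `baseline`."""
    if lower_is_better:
        return 100.0 * (baseline - candidate) / baseline
    return 100.0 * (candidate - baseline) / baseline


# Made-up pick indices for four traces; not results from the paper.
reference_picks = [100, 250, 80, 320]
baseline_picks = [104, 244, 85, 310]   # hypothetical baseline model
candidate_picks = [102, 247, 82, 315]  # hypothetical noise-robust model

mae_gain = relative_improvement(
    mean_abs_error(baseline_picks, reference_picks),
    mean_abs_error(candidate_picks, reference_picks),
)
hit_gain = relative_improvement(
    hit_rate(baseline_picks, reference_picks, tolerance=5),
    hit_rate(candidate_picks, reference_picks, tolerance=5),
    lower_is_better=False,
)
print(f"MAE improvement: {mae_gain:.1f}%, hit-rate improvement: {hit_gain:.1f}%")
```

Reporting the gain per metric, rather than as a single number, shows whether an improvement comes from fewer large outliers or from tighter picks overall.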
Topics
- Label Noise
- Seismic Signal Processing
- P-phase Picking
- Robust Learning
- Geophysics
- Machine Learning
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.