[P] SoftDTW-CUDA for PyTorch package: fast + memory-efficient Soft Dynamic Time Warping with CUDA support
Summary
The SoftDTW-CUDA for PyTorch package is a GPU-accelerated, memory-efficient implementation of Soft Dynamic Time Warping (SoftDTW) for PyTorch. SoftDTW, introduced by Cuturi & Blondel (2017), is a differentiable alignment loss for time series data. The implementation targets three practical limitations of existing SoftDTW versions: speed, memory consumption, and sequence length. The reported benchmarks show it is approximately 67 times faster than the Maghoumi-style CUDA/Numba implementation and uses about 98% less GPU memory, the latter through fused distance computation. Tiled anti-diagonal execution lifts the N ≤ 1024 sequence length limit, and log-space gradients keep the backward pass numerically stable. The package also includes SoftDTW barycenters for averaging in DTW space, which makes it suitable for representation learning, metric learning, sequence-to-sequence matching, and forecasting.
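For readers new to SoftDTW, the recursion behind the loss can be sketched in a few lines of plain Python. This is a minimal CPU reference under stated assumptions (1-D sequences, squared-difference cost, and the hypothetical function names `softmin` and `soft_dtw`); it is not the package's API or its CUDA kernel.

```python
import math


def softmin(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma))), computed stably."""
    lowest = min(values)
    if math.isinf(lowest):
        # All candidates are unreachable; keep the cell unreachable.
        return lowest
    return lowest - gamma * math.log(
        sum(math.exp(-(v - lowest) / gamma) for v in values)
    )


def soft_dtw(x, y, gamma=1.0):
    """Reference O(N*M) SoftDTW value for two 1-D sequences of floats."""
    n, m = len(x), len(y)
    # r[i][j] holds the soft alignment cost of x[:i] against y[:j].
    r = [[math.inf] * (m + 1) for _ in range(n + 1)]
    r[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2  # assumed squared-difference cost
            r[i][j] = cost + softmin(
                (r[i - 1][j - 1], r[i - 1][j], r[i][j - 1]), gamma
            )
    return r[n][m]
```

As `gamma` shrinks, the value approaches classic (hard) DTW; larger `gamma` gives a smoother loss that never exceeds the hard DTW value.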
Key takeaway
For AI engineers and research scientists working with time series data and differentiable alignment, SoftDTW-CUDA makes SoftDTW more practical to use. Longer sequences and larger batches fit in substantially less GPU memory and run faster, so representation learning and forecasting models can be trained without CPU fallbacks or limits on sequence length.
Key insights
SoftDTW-CUDA provides a fast, memory-efficient, and scalable SoftDTW implementation for PyTorch.
Principles
- Fused distance computation reduces GPU memory.
- Tiled anti-diagonal execution enables longer sequences.
- Log-space gradients ensure numerical stability.
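The first two principles can be illustrated with a small CPU sketch. Cells on the same anti-diagonal (i + j = d) depend only on the two previous anti-diagonals, so they can be computed independently, which is what a GPU exploits. Computing each pairwise cost inside the loop, rather than building an N x M distance matrix first, is one way to read "fused distance computation". The function names and the squared-difference cost are assumptions for illustration; the package's actual tiling scheme is not shown here.

```python
import math


def softmin3(a, b, c, gamma):
    """Stable smoothed minimum of three values (at least one must be finite)."""
    lowest = min(a, b, c)
    return lowest - gamma * math.log(
        math.exp(-(a - lowest) / gamma)
        + math.exp(-(b - lowest) / gamma)
        + math.exp(-(c - lowest) / gamma)
    )


def soft_dtw_wavefront(x, y, gamma=1.0):
    """SoftDTW value computed one anti-diagonal at a time."""
    n, m = len(x), len(y)
    r = [[math.inf] * (m + 1) for _ in range(n + 1)]
    r[0][0] = 0.0
    # Anti-diagonal d contains every interior cell (i, j) with i + j == d.
    for d in range(2, n + m + 1):
        i_lo = max(1, d - m)
        i_hi = min(n, d - 1)
        # Cells in this inner loop do not depend on each other, so a GPU
        # could assign one thread per cell.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            # Fused distance: the cost is computed on the fly instead of
            # being read from a precomputed N x M matrix.
            cost = (x[i - 1] - y[j - 1]) ** 2
            r[i][j] = cost + softmin3(
                r[i - 1][j - 1], r[i - 1][j], r[i][j - 1], gamma
            )
    return r[n][m]
```

The result matches a row-by-row traversal, because only the order in which cells are visited changes, not the recursion itself.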
Method
The implementation uses Numba CUDA kernels with full PyTorch autograd integration, employing fused distance computation and tiled anti-diagonal execution for efficiency and scalability.
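Autograd integration means the backward pass has to return the gradient of the SoftDTW value with respect to the pairwise costs. The sketch below is a plain-Python version of the backward recursion from Cuturi & Blondel (2017), with the accumulation done on log-values so that intermediate exponentials do not overflow. It is an assumption about what "log-space gradients" could look like, not the package's kernel, and `soft_dtw_value_and_grad` is a hypothetical name.

```python
import math

NEG_INF = -math.inf


def _logsumexp(values):
    """log(sum(exp(v))) computed stably; returns -inf if every term is -inf."""
    top = max(values)
    if top == NEG_INF:
        return NEG_INF
    return top + math.log(sum(math.exp(v - top) for v in values))


def soft_dtw_value_and_grad(cost, gamma=1.0):
    """cost: N x M nested list of pairwise costs.

    Returns (soft_dtw_value, gradient of the value w.r.t. each cost entry).
    """
    n, m = len(cost), len(cost[0])
    # Padded, 1-based copies so boundary cells need no special cases.
    d = [[0.0] * (m + 2) for _ in range(n + 2)]
    r = [[math.inf] * (m + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = cost[i - 1][j - 1]

    # Forward pass: softmin written as -gamma * logsumexp(-value / gamma).
    r[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            prev = (r[i - 1][j - 1], r[i - 1][j], r[i][j - 1])
            r[i][j] = d[i][j] - gamma * _logsumexp([-p / gamma for p in prev])

    # Boundary conditions for the backward recursion.
    for i in range(1, n + 1):
        r[i][m + 1] = NEG_INF
    for j in range(1, m + 1):
        r[n + 1][j] = NEG_INF
    r[n + 1][m + 1] = r[n][m]

    # Backward pass on log E, where E[i][j] is the derivative of the final
    # value with respect to r[i][j] (and therefore with respect to d[i][j]).
    log_e = [[NEG_INF] * (m + 2) for _ in range(n + 2)]
    log_e[n + 1][m + 1] = 0.0
    for j in range(m, 0, -1):
        for i in range(n, 0, -1):
            log_e[i][j] = _logsumexp([
                log_e[i + 1][j] + (r[i + 1][j] - r[i][j] - d[i + 1][j]) / gamma,
                log_e[i][j + 1] + (r[i][j + 1] - r[i][j] - d[i][j + 1]) / gamma,
                log_e[i + 1][j + 1]
                + (r[i + 1][j + 1] - r[i][j] - d[i + 1][j + 1]) / gamma,
            ])

    grad = [
        [math.exp(log_e[i][j]) for j in range(1, m + 1)]
        for i in range(1, n + 1)
    ]
    return r[n][m], grad
```

Each gradient entry can be read as the soft probability that the alignment path passes through that cell, so the first and last cells always receive a gradient of 1. In a PyTorch setting, a forward/backward pair like this would typically be wrapped in a custom `torch.autograd.Function`.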
In practice
- Use as a differentiable loss for time series alignment.
- Use as a training loss for forecasting tasks.
- Compute DTW-space barycenters for temporal prototypes.
Topics
- Soft Dynamic Time Warping
- PyTorch
- GPU Acceleration
- Time Series Analysis
- Differentiable Alignment
Code references
- BGU-CS-VIL/sdtw-cuda-torch
- Sleepwalking/pytorch-softdtw
- Maghoumi/pytorch-softdtw-cuda
- keonlee9420/Soft-DTW-Loss
Best for: AI Engineer, AI Scientist, Research Scientist, Machine Learning Engineer, Deep Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.