Fixing Unsupervised Hyperbolic Contrastive Loss [D]
Summary
A user implementing Unsupervised Hyperbolic Contrastive Loss on the ImageNet-1k dataset reported that their hyperbolic model reached only 57% 1-NN accuracy, well below the 64% of a simple Euclidean contrastive loss. The user shared Python code for their `hb_contrastive_loss` function, which uses `expmap()` and `projx()` to keep embeddings on the Lorentzian manifold, and listed a temperature of 0.07, a batch size of 2048, and a learning rate of 1e-4. The likely culprit is the low temperature: 0.07 is a value commonly used with cosine-similarity logits, which are bounded in [-1, 1], whereas hyperbolic distances are unbounded and can be much larger.
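For readers unfamiliar with the setup, here is a minimal, dependency-free sketch of a contrastive loss on the Lorentz model with curvature fixed at -1. It is not the user's `hb_contrastive_loss`, which is not reproduced in this summary. The names `expmap0`, `lorentz_dist`, and `hyperbolic_info_nce` are hypothetical, and the logit is assumed to be the negative geodesic distance divided by the temperature. A real implementation would be vectorized in a tensor library.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def expmap0(v):
    """Map a Euclidean (tangent) vector v onto the unit hyperboloid via the origin."""
    norm = math.sqrt(sum(a * a for a in v))
    if norm < 1e-12:
        return [1.0] + [0.0] * len(v)
    scale = math.sinh(norm) / norm
    return [math.cosh(norm)] + [scale * a for a in v]

def lorentz_dist(x, y):
    """Geodesic distance; the clamp guards against acosh of values just below 1."""
    return math.acosh(max(1.0, -lorentz_inner(x, y)))

def hyperbolic_info_nce(anchors, positives, temperature):
    """InfoNCE where the logit for pair (i, j) is -d(anchor_i, positive_j) / temperature."""
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [-lorentz_dist(a, p) / temperature for p in positives]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]  # cross-entropy with the matching index as target
    return total / len(anchors)

# Toy 2-D tangent vectors standing in for two augmented views of three images.
views_a = [expmap0(v) for v in [[0.5, 0.0], [0.0, 0.5], [-0.5, 0.0]]]
views_b = [expmap0(v) for v in [[0.6, 0.1], [0.1, 0.6], [-0.6, -0.1]]]
loss = hyperbolic_info_nce(views_a, views_b, temperature=0.5)
```

In this sketch the distance from the origin to `expmap0(v)` equals the Euclidean norm of `v`, so encoder outputs with large norms translate directly into large distances and large logits.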
Key takeaway
For machine learning engineers working with hyperbolic contrastive loss, ensure your temperature parameter is significantly higher than what you would use for Euclidean space, potentially in the 0.5 to 1.0 range, to account for the larger distance scales. Additionally, verify that manifold projection steps are applied not just during the forward pass but also after every gradient update to maintain embedding validity. Experiment with learning rate and batch size sweeps to further optimize performance.
Key insights
Hyperbolic contrastive loss requires a higher temperature parameter than Euclidean loss due to different distance scales.
Principles
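- Hyperbolic distances are unbounded and grow as embeddings move away from the origin, so they are often larger than the bounded similarity scores (e.g., cosine similarity) used in Euclidean contrastive losses.
- Projection onto the manifold is needed after every gradient update, since an update step can move points off the manifold.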
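To make the distance-scale point concrete, the sketch below compares how the same set of distances behaves under a temperature of 0.07 and a temperature of 0.5. The distances are made-up values, and `softmax_over_negative_distances` is a hypothetical helper. This illustrates the mechanism only and does not confirm the cause of the reported accuracy gap.

```python
import math

def softmax_over_negative_distances(distances, temperature):
    """Softmax of -d / temperature, computed stably."""
    logits = [-d / temperature for d in distances]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Made-up distances from one anchor to its positive (first) and three negatives.
distances = [1.0, 2.0, 3.0, 4.0]

sharp = softmax_over_negative_distances(distances, temperature=0.07)
soft = softmax_over_negative_distances(distances, temperature=0.5)

# Logit gap between the positive and the nearest negative at each temperature.
gap_sharp = (distances[1] - distances[0]) / 0.07   # about 14.3
gap_soft = (distances[1] - distances[0]) / 0.5     # exactly 2.0
```

At the lower temperature the softmax puts nearly all of its mass on the positive, so the negatives contribute almost no gradient for this anchor. At the higher temperature the negatives keep a noticeable share of the probability mass.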
In practice
- Increase the temperature parameter for the hyperbolic loss (e.g., 0.5 to 1.0).
- Ensure manifold projection after each gradient update.
- Sweep learning rate and batch size for optimization.
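The projection step can be sketched as follows for the Lorentz model with curvature -1. The idea is to keep the spatial coordinates and recompute the time coordinate so that the Lorentzian inner product of the point with itself is -1 again. This is a simplified illustration in the spirit of a `projx()` call, not the user's code or any library's implementation. `project_to_hyperboloid` and `sgd_step_then_project` are hypothetical names, the gradient is made up, and a proper Riemannian optimizer would also convert the gradient and move along the manifold rather than take a plain Euclidean step.

```python
import math

def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def project_to_hyperboloid(x):
    """Keep the spatial part and recompute the time coordinate so <x, x>_L = -1."""
    spatial = x[1:]
    return [math.sqrt(1.0 + sum(a * a for a in spatial))] + list(spatial)

def sgd_step_then_project(x, grad, lr):
    """Plain Euclidean SGD step followed by re-projection (hypothetical helper)."""
    stepped = [xi - lr * gi for xi, gi in zip(x, grad)]
    return project_to_hyperboloid(stepped)

point = project_to_hyperboloid([0.0, 0.3, -0.4])   # a valid point on the manifold
grad = [0.2, -1.0, 0.5]                            # made-up gradient for illustration

raw = [xi - 0.1 * gi for xi, gi in zip(point, grad)]   # step without projection
fixed = sgd_step_then_project(point, grad, lr=0.1)     # step with projection

# How far each result is from satisfying the constraint <x, x>_L = -1.
drift_raw = abs(lorentz_inner(raw, raw) + 1.0)
drift_fixed = abs(lorentz_inner(fixed, fixed) + 1.0)
```

Here `drift_raw` is clearly nonzero after a single unprojected step, while `drift_fixed` stays at floating-point precision. Left uncorrected, such drift would accumulate over training and make the distance computation unreliable.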
Topics
- Unsupervised Hyperbolic Contrastive Loss
- ImageNet-1k
- Lorentzian Manifold
- Contrastive Learning
- Hyperparameter Tuning
Best for: Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.