Efficient Solvers for SLOPE in R, Python, Julia, and C++

Source: stat.ML updates on arXiv.org · Field: Technology & Digital: Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Expert, extended

Summary

A new suite of packages in R, Python, Julia, and C++ has been released for solving the Sorted L-One Penalized Estimation (SLOPE) problem. The packages are built around a hybrid coordinate descent algorithm that fits generalized linear models (GLMs) with Gaussian, binomial, Poisson, and multinomial logistic losses. The implementation is designed for speed, memory efficiency, and flexibility: it supports dense, sparse, and out-of-memory matrices, and it handles full SLOPE path fitting and cross-validation, including relaxed SLOPE. In benchmarks on real and simulated data, the packages consistently run faster than existing SLOPE implementations, particularly in high-regularization regimes. In an application to a metabolomics dataset, SLOPE proved useful for feature selection and interpretability, achieving an AUC of 0.978 compared to lasso's 0.94.
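For readers unfamiliar with the penalty itself, the sorted L1 norm pairs a non-increasing sequence of weights with the coefficient magnitudes sorted in decreasing order, so the largest coefficient receives the largest penalty. The following is a minimal pure-Python sketch of that definition; the function name `sorted_l1_norm` is hypothetical and is not taken from the released packages.

```python
def sorted_l1_norm(beta, lambdas):
    """Sorted L1 norm: sum_j lambdas[j] * |beta|_(j), where
    |beta|_(1) >= |beta|_(2) >= ... are the sorted magnitudes.

    Hypothetical helper for illustration; not the packages' API.
    """
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    if any(lam < 0 for lam in lambdas):
        raise ValueError("lambdas must be non-negative")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be non-increasing")

    # Sort magnitudes in decreasing order, then pair with the weights.
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, magnitudes))


# With a constant weight sequence the penalty reduces to the lasso's
# (scaled) L1 norm; with decreasing weights, larger coefficients cost more.
lasso_like = sorted_l1_norm([1.0, -3.0, 2.0], [1.0, 1.0, 1.0])
slope_like = sorted_l1_norm([1.0, -3.0, 2.0], [3.0, 2.0, 1.0])
```

The constant-weight case is why SLOPE is usually described as a generalization of the lasso.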

Key takeaway

Data scientists and machine learning engineers working with high-dimensional regression should consider adopting the new SLOPE packages for R, Python, Julia, or C++. Their demonstrated speed and memory efficiency when fitting generalized linear models, especially over full regularization paths and in cross-validation, can shorten model development, and SLOPE's coefficient clustering can make selected features easier to interpret. The relaxed SLOPE functionality is worth exploring as a way to mitigate shrinkage bias.

Key insights

The new SLOPE packages offer superior performance and flexibility for regularized GLM estimation across multiple programming languages.

Method

The packages use a hybrid coordinate descent algorithm that alternates proximal gradient descent steps with coordinate descent on collapsed clusters, and they extend it to GLMs via iteratively reweighted least squares (IRLS). The implementation also incorporates screening rules and path fitting.
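The proximal gradient step in such a scheme relies on the proximal operator of the sorted L1 norm, which is also what produces the clusters of equal-magnitude coefficients. Below is a minimal pure-Python sketch of that operator using the standard sort, pool-adjacent-violators, and clip construction. It is an illustration under stated assumptions, not the packages' implementation: the name `prox_sorted_l1` is hypothetical, and the coordinate descent, IRLS, and screening parts are not shown.

```python
def prox_sorted_l1(v, lambdas):
    """Proximal operator of the sorted L1 norm (illustrative sketch).

    Solves argmin_x 0.5 * ||x - v||^2 + sum_j lambdas[j] * |x|_(j),
    assuming lambdas is non-negative and non-increasing.
    """
    p = len(v)
    if len(lambdas) != p:
        raise ValueError("v and lambdas must have the same length")

    # 1. Order coordinates by decreasing magnitude and subtract the weights.
    order = sorted(range(p), key=lambda i: abs(v[i]), reverse=True)
    diffs = [abs(v[i]) - lam for i, lam in zip(order, lambdas)]

    # 2. Pool adjacent violators: merge blocks until block averages are
    #    strictly decreasing. Each block is stored as (sum, count).
    blocks = []
    for d in diffs:
        total, count = d, 1
        while blocks and blocks[-1][0] / blocks[-1][1] <= total / count:
            prev_total, prev_count = blocks.pop()
            total += prev_total
            count += prev_count
        blocks.append((total, count))

    # 3. Clip block averages at zero, then undo the sort and restore signs.
    out = [0.0] * p
    pos = 0
    for total, count in blocks:
        value = max(total / count, 0.0)
        for _ in range(count):
            i = order[pos]
            out[i] = value if v[i] >= 0 else -value
            pos += 1
    return out


# Two coefficients are pulled to a common magnitude (a cluster).
clustered = prox_sorted_l1([3.0, -1.0, 2.5], [2.0, 1.0, 0.5])
# With constant weights the operator reduces to soft thresholding.
soft = prox_sorted_l1([3.0, -1.0, 0.5], [1.0, 1.0, 1.0])
```

In the first call, the coefficients at positions 0 and 2 end up sharing one magnitude; a hybrid method of the kind described above can then treat such a cluster as a single collapsed coordinate.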

Best for: Research Scientist, AI Engineer, Machine Learning Engineer, AI Scientist, Data Scientist, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by stat.ML updates on arXiv.org.