How to Call Rust from Python
Summary
This article demonstrates how to integrate Rust code with Python to accelerate performance-critical sections, particularly non-vectorizable loops. It introduces Maturin, a build and packaging tool that compiles Rust code into Python modules using PyO3, enabling seamless interoperability. The author provides practical examples, starting with a "Hello World" equivalent and progressing to a text normalization task involving lowercasing, punctuation removal, and tokenization. Benchmarks show that a single-threaded Rust implementation processes 500,000 texts in 3.024 seconds (165,343 texts/s), significantly faster than Python's 5.159 seconds (96,919 texts/s). Further, integrating Rust's Rayon library for parallelism reduces processing time to 2.223 seconds (224,914 texts/s), achieving over twice the speed of pure Python with minimal code changes.
Key takeaway
For Software Engineers and Data Scientists optimizing Python applications, consider offloading non-vectorizable, CPU-intensive loops to Rust using Maturin and PyO3. This approach allows you to retain Python's rich ecosystem and development convenience for most of your project while gaining Rust's speed, memory safety, and concurrency benefits where performance is critical. Benchmarking your hot loops will identify prime candidates for this hybrid optimization strategy.
Key insights
Integrating Rust with Python via Maturin and PyO3 can significantly accelerate performance-critical, non-vectorizable code sections.
Principles
- Python's ecosystem offers broad utility, while Rust provides predictable performance.
- Small Rust code sections can yield substantial Python performance gains.
- Parallelization with Rayon further enhances Rust-Python hybrid performance.
Method
Use Maturin to compile Rust code (with PyO3) into a Python-importable shared library. Define Rust functions with `#[pyfunction]` and register them in a `#[pymodule]` to expose them to Python. For parallelism, replace sequential iterators with `par_iter()` from the Rayon library.
In practice
- Use Maturin for building and packaging Rust-based Python extensions.
- Offload CPU-bound string processing loops to Rust.
- Employ Rayon for easy parallelization of Rust code.
Topics
- Rust-Python Integration
- Maturin Build Tool
- PyO3
- Performance Optimization
- Parallel Processing
Best for: Software Engineer, Machine Learning Engineer, Data Scientist
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.