TRE Python binding — ReDoS robustness demo

· Source: Simon Willison's Weblog · Field: Technology & Digital — Software Development & Engineering, Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy · Depth: Advanced, quick

Summary

A new project demonstrates the TRE regular expression library's robustness against Regular Expression Denial-of-Service (ReDoS) attacks and contrasts its performance with Python's built-in `re` module. It provides a minimal Python `ctypes` binding for TRE and shows that TRE's matching time scales linearly with input size, whereas `re` can scale exponentially on the same patterns. In the benchmarks, TRE processes "evil" regex patterns on inputs of up to 10 million characters, while `re` slows down sharply on much smaller inputs. The resilience is attributed to TRE's design, specifically its lack of support for backtracking, the mechanism behind ReDoS in many other regex engines.

Key takeaway

Python developers whose applications process untrusted user input or complex regex patterns should consider integrating the TRE library. Its resistance to ReDoS in these benchmarks and its linear performance scaling offer a security and stability advantage over Python's default `re` module and help prevent denial-of-service vulnerabilities.

Key insights

The TRE regex library offers ReDoS immunity and linear scaling because of its non-backtracking design.
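
To make the backtracking problem concrete, here is a minimal sketch that times Python's `re` on a classic "evil" pattern. The pattern `(a+)+$`, the input sizes, and the helper name `time_failed_match` are illustrative choices, not taken from the project's benchmarks.

```python
import re
import time

# Nested quantifiers: on a failing input, a backtracking engine tries
# many different ways to split the run of "a" characters between groups.
EVIL = re.compile(r"(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a' characters plus '!'."""
    text = "a" * n + "!"  # the trailing "!" guarantees that the match fails
    start = time.perf_counter()
    result = EVIL.match(text)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed


if __name__ == "__main__":
    # Keep n small: each extra character roughly doubles the work.
    for n in range(12, 21, 2):
        print(f"n={n:2d}  {time_failed_match(n):.4f}s")
```

On a backtracking engine the reported times should grow steeply as `n` increases. A non-backtracking matcher is expected to grow roughly in proportion to the input length instead.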

Principles

Method

The project uses Python's `ctypes` to create a minimal binding to the TRE library, then benchmarks its performance against malicious regex patterns.
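
A binding of this kind could look roughly like the sketch below. It is not the project's code. It assumes a shared `libtre` is installed and exports the POSIX-style `tre_regcomp`, `tre_regexec`, and `tre_regfree` functions. It also assumes that `REG_EXTENDED` is 1, as in TRE's `tre.h`, and that a 256-byte buffer is large enough to stand in for `regex_t`. The helper names `load_tre` and `tre_search` are hypothetical.

```python
import ctypes
import ctypes.util

# Assumption: value of REG_EXTENDED in TRE's tre.h; verify against your headers.
REG_EXTENDED = 1


def load_tre():
    """Return a ctypes handle to libtre, or None if it is unavailable."""
    path = ctypes.util.find_library("tre")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # Declare the C signatures so ctypes converts arguments correctly.
        lib.tre_regcomp.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int]
        lib.tre_regcomp.restype = ctypes.c_int
        lib.tre_regexec.argtypes = [
            ctypes.c_void_p, ctypes.c_char_p, ctypes.c_size_t,
            ctypes.c_void_p, ctypes.c_int,
        ]
        lib.tre_regexec.restype = ctypes.c_int
        lib.tre_regfree.argtypes = [ctypes.c_void_p]
        lib.tre_regfree.restype = None
    except (OSError, AttributeError):
        return None
    return lib


def tre_search(lib, pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text (no match offsets)."""
    # Assumption: 256 bytes is an oversized stand-in for TRE's regex_t struct.
    preg = ctypes.create_string_buffer(256)
    rc = lib.tre_regcomp(preg, pattern.encode("utf-8"), REG_EXTENDED)
    if rc != 0:
        raise ValueError(f"tre_regcomp failed with code {rc}")
    try:
        # nmatch=0 and pmatch=NULL: only ask whether a match exists.
        # The C API stops at the first NUL byte in the encoded text.
        return lib.tre_regexec(preg, text.encode("utf-8"), 0, None, 0) == 0
    finally:
        lib.tre_regfree(preg)


if __name__ == "__main__":
    lib = load_tre()
    if lib is None:
        print("libtre not found; install TRE to try this sketch")
    else:
        print(tre_search(lib, r"(a+)+$", "a" * 100_000 + "!"))
```

Timing `tre_search` with `time.perf_counter` on increasingly long inputs, next to the same pattern under `re`, would give a comparison in the spirit of the benchmark described above.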

Best for: Machine Learning Engineer, NLP Engineer, CTO, Software Engineer, AI Engineer, AI Security Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.