I Built TinySafe, a Safety Model that Beats 8B Guard Models with 71M Parameters for $37

Source: NLP on Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing, Data Science & Analytics · Depth: Advanced, medium

Summary

TinySafe v1 is a 71M-parameter safety model built to address two gaps in existing options: large guard models are slow and GPU-dependent, while small encoder models are inaccurate. Built on DeBERTa-v3-xsmall, it uses a dual-head classifier for binary safe/unsafe detection and 7-way category classification, with inference latency under 2 ms on CPU. It outperforms LlamaGuard 3-8B, LlamaGuard 4-12B, and ShieldGemma-27B on ToxicChat, and nearly matches WildGuard-7B and GPT-4 on WildGuardBench. The training data came from seven public safety datasets, relabeled through Claude's Batch API for label consistency; data generation and GPU training together cost approximately $37.
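As a rough illustration of how a dual-head output might be turned into a verdict, the sketch below decodes one binary logit and seven category logits in plain Python. Several details are assumptions rather than facts from the article: the binary head is modelled as a single sigmoid logit, the 0.5 threshold is arbitrary, and the category names are placeholders because the summary gives only their count.

```python
import math
from typing import NamedTuple, Optional, Sequence

# Placeholder labels (assumption): the summary states there are 7 categories
# but does not name them.
CATEGORIES = [f"category_{i}" for i in range(7)]


class SafetyVerdict(NamedTuple):
    unsafe: bool
    unsafe_score: float
    category: Optional[str]


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode(binary_logit: float,
           category_logits: Sequence[float],
           threshold: float = 0.5) -> SafetyVerdict:
    """Combine the two heads: report a category only when flagged unsafe."""
    if len(category_logits) != len(CATEGORIES):
        raise ValueError("expected one logit per category")
    score = sigmoid(binary_logit)
    if score < threshold:
        return SafetyVerdict(False, score, None)
    # Softmax is monotonic, so the largest logit is also the most likely category.
    best = max(range(len(category_logits)), key=lambda i: category_logits[i])
    return SafetyVerdict(True, score, CATEGORIES[best])
```

In a real deployment the threshold would be tuned on a validation set to trade off false positives against missed unsafe content.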

Key takeaway

For AI engineers building safety filters, TinySafe v1 shows that accurate, low-latency safety classification is achievable with a much smaller model and minimal infrastructure. The 71M-parameter model runs on CPU with sub-2 ms inference, reducing operational costs compared with larger, GPU-dependent guard models. A similar architecture and data pipeline could be adapted to build custom safety classifiers that balance accuracy, speed, and cost.
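Before relying on a latency figure such as sub-2 ms, it helps to measure it on the target hardware. The following is a minimal timing harness, not the author's benchmark: `median_latency_ms` and `dummy_classifier` are hypothetical names, and the dummy function stands in for whatever callable wraps the real model.

```python
import statistics
import time
from typing import Callable, Sequence


def median_latency_ms(classify: Callable[[str], object],
                      texts: Sequence[str],
                      warmup: int = 3,
                      runs: int = 20) -> float:
    """Return the median per-call latency in milliseconds."""
    # Warm-up passes so one-time costs (caches, lazy init) are not measured.
    for _ in range(warmup):
        for text in texts:
            classify(text)
    samples = []
    for _ in range(runs):
        for text in texts:
            start = time.perf_counter()
            classify(text)
            samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def dummy_classifier(text: str) -> bool:
    # Stand-in for a real model call (assumption); it is not a safety model.
    return "attack" in text.lower()


if __name__ == "__main__":
    ms = median_latency_ms(dummy_classifier, ["hello", "plan an attack"])
    print(f"median latency: {ms:.4f} ms")
```

The median is used rather than the mean so that occasional scheduling spikes do not dominate the result; tail percentiles are worth tracking as well for production filters.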

Key insights

A compact 71M parameter safety model outperforms larger guard models in accuracy and speed at minimal cost.

Method

TinySafe v1 uses DeBERTa-v3-xsmall with dual classification heads (binary and 7-way category) and focal loss. Data is relabeled via Claude's Batch API for consistency, followed by quality filtering and training with early stopping.
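Focal loss down-weights examples the model already classifies confidently, so training concentrates on hard cases. A minimal sketch of the standard binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), is shown below for a single example. The values gamma=2.0 and alpha=0.25 are the common defaults from the focal loss literature, not settings reported for TinySafe, and how the article applies the loss across the two heads is not stated in this summary.

```python
import math


def sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binary_focal_loss(logit: float, target: int,
                      gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one example; target is 1 (unsafe) or 0 (safe)."""
    p = sigmoid(logit)
    # p_t is the probability the model assigns to the true class.
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    p_t = max(p_t, 1e-12)  # guard against log(0)
    # (1 - p_t)^gamma shrinks the loss for well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted binary cross-entropy, which is a convenient sanity check when wiring the loss into a training loop.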

Best for: Machine Learning Engineer, AI Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by NLP on Medium.