ARIADNE: Agnostic Routing for Inference-time Adapter DyNamic sElection

2026-06-17 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

ARIADNE is a training-free, adapter-agnostic routing framework for selecting adapters dynamically at inference time. It addresses the problem of automatically choosing the most appropriate task-specialized adapter from a growing pool when input queries carry no explicit task label. ARIADNE represents each adapter by a set of centroids derived from embeddings of that adapter's training data, so the centroids capture the data distribution the adapter was trained on. At inference, it embeds the unlabeled input and selects an adapter according to how close that embedding lies to each adapter's centroids in latent space. Evaluated with Llama 3.2 1B Instruct across 23 diverse NLP tasks, ARIADNE recovers 97.44% of the upper-bound performance. When scaled to 44 tasks, it reaches an average selection accuracy of 89.7%, without additional training or access to adapter internals.

Key takeaway

For machine learning engineers deploying PEFT models with many task-specialized adapters, ARIADNE offers a training-free way to select an adapter at inference time. The reported average selection accuracy is 89.7% across 44 tasks, obtained without modifying existing adapters or paying for router training. Because routing relies only on centroids computed from each adapter's training data, the adapter pool can grow without retraining a router.

Key insights

ARIADNE enables training-free, adapter-agnostic selection by matching input embeddings to adapter-specific centroids in latent space.

Principles

Adapter selection can be training-free.
Input embedding space routing is PEFT-agnostic.
Centroids capture adapter data distribution.

Method

For each adapter, ARIADNE computes a set of centroids from embeddings of that adapter's training data. At inference, it embeds the input and selects an adapter by measuring how close the input embedding is to these centroids in latent space.
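
The two steps above can be illustrated with a minimal sketch. It is not the authors' implementation: this summary does not say how centroids are derived or which proximity measure is used, so the k-means clustering, the Euclidean distance, the toy 2-D vectors standing in for encoder embeddings, and the names `kmeans`, `build_centroids`, and `route` are all assumptions made for illustration.

```python
import math
import random

def euclidean(a, b):
    """Euclidean distance (an assumed proximity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed way to obtain centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_centroids(train_embeddings, k=2):
    """Map each adapter name to centroids of its training-data embeddings."""
    return {
        name: kmeans(embs, min(k, len(embs)))
        for name, embs in train_embeddings.items()
    }

def route(query_embedding, centroids_by_adapter):
    """Pick the adapter whose closest centroid is nearest to the query."""
    return min(
        centroids_by_adapter,
        key=lambda name: min(
            euclidean(query_embedding, c) for c in centroids_by_adapter[name]
        ),
    )

# Toy 2-D "embeddings"; a real system would use an encoder's output vectors.
train_embeddings = {
    "sentiment_adapter": [[0.9, 0.1], [1.0, 0.0], [0.8, 0.2], [1.1, 0.1]],
    "summarization_adapter": [[0.0, 1.0], [0.1, 0.9], [0.2, 1.1], [0.0, 0.8]],
}
centroids = build_centroids(train_embeddings, k=2)
chosen = route([0.95, 0.05], centroids)
```

Nothing in this sketch is trained for routing: the only per-adapter state is a short list of centroids, and the adapters themselves are never inspected.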

In practice

Integrate with Llama 3.2 1B Instruct for NLP tasks.
Scale adapter pools without retraining routers.
Use with arbitrary PEFT methods.
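
To make the point about scaling concrete, the sketch below shows one way a pool could grow under a centroid-based router. The `CentroidRouter` class, its `register` and `select` methods, the adapter names, and the hand-written 3-D centroids are hypothetical, and Euclidean distance is again an assumed proximity measure rather than something this summary specifies.

```python
import math

def euclidean(a, b):
    """Euclidean distance (an assumed proximity measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class CentroidRouter:
    """Hypothetical registry mapping adapter names to precomputed centroids."""

    def __init__(self):
        self._centroids = {}

    def register(self, adapter_name, centroids):
        # Adding an adapter only stores its centroids; nothing is trained
        # and the entries for other adapters are left untouched.
        self._centroids[adapter_name] = [tuple(c) for c in centroids]

    def select(self, query_embedding):
        # Score each adapter by the distance to its closest centroid.
        def closest(name):
            return min(euclidean(query_embedding, c) for c in self._centroids[name])
        return min(self._centroids, key=closest)

router = CentroidRouter()
router.register("translation_adapter", [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0)])
router.register("qa_adapter", [(0.0, 1.0, 0.0)])

query = (0.1, 0.1, 0.9)
before = router.select(query)  # best match among the two registered adapters

# Growing the pool is a single registration call, with no retraining step.
router.register("code_adapter", [(0.0, 0.0, 1.0)])
after = router.select(query)
```

Because the router only reads input embeddings and stored centroids, it does not depend on how each adapter was produced, which is consistent with the PEFT-agnostic claim above.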

Topics

Adapter Routing
PEFT
Inference Optimization
Llama 3.2
NLP Tasks
Training-Free Methods

Best for: AI Architect, AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.