ARIADNE: Agnostic Routing for Inference-time Adapter DyNamic sElection
Summary
ARIADNE is a novel, training-free, and adapter-agnostic routing framework designed for dynamic adapter selection during inference. It addresses the challenge of automatically choosing the most appropriate task-specialized adapter from a growing pool when input queries lack explicit task labels. ARIADNE operates by representing each adapter through a set of centroids, derived from embeddings of its training data, which capture the adapter's associated data distribution. During inference, it selects an adapter by measuring the proximity of an unlabeled input's embedding to these centroids in latent space. Evaluated with Llama 3.2 1B Instruct across 23 diverse NLP tasks, ARIADNE recovers 97.44% of the upper bound performance. When scaled to 44 tasks, it achieves an 89.7% average selection accuracy without requiring additional training or access to adapter internals.
Key takeaway
For Machine Learning Engineers deploying PEFT models with numerous task-specialized adapters, ARIADNE offers a scalable, training-free way to select adapters dynamically at inference time. You can achieve high selection accuracy (e.g., 89.7% on average across 44 tasks) without modifying existing adapters or incurring router training costs, which streamlines model management. Because adding an adapter requires no router retraining, the approach is designed to keep working as your adapter ecosystem grows.
Key insights
ARIADNE enables training-free, adapter-agnostic selection by matching input embeddings to adapter-specific centroids in latent space.
Principles
- Adapter selection can be training-free.
- Input embedding space routing is PEFT-agnostic.
- Centroids capture adapter data distribution.
Method
ARIADNE computes centroids from training set embeddings for each adapter. At inference, it selects an adapter by measuring input embedding proximity to these centroids in latent space.
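The two steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the summary does not say how centroids are derived or which distance is used, so plain k-means and Euclidean distance are assumptions here, as are the function names `kmeans_centroids` and `select_adapter`, the adapter names, and the toy embeddings standing in for real encoder outputs.

```python
import math
import random


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def kmeans_centroids(embeddings, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns k centroids."""
    rng = random.Random(seed)
    # Initialise from k distinct training embeddings.
    centroids = [list(v) for v in rng.sample(embeddings, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in embeddings:
            # Assign each embedding to its nearest centroid.
            nearest = min(range(k), key=lambda j: math.dist(v, centroids[j]))
            clusters[nearest].append(v)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        centroids = [
            mean_vector(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
    return centroids


def select_adapter(query_embedding, adapter_centroids):
    """Return the adapter whose closest centroid is nearest to the query."""
    return min(
        adapter_centroids,
        key=lambda name: min(
            math.dist(query_embedding, c) for c in adapter_centroids[name]
        ),
    )


# Toy embeddings (hypothetical): two adapters with well-separated data.
rng = random.Random(42)


def toy_embeddings(center, n=50, dim=8):
    return [[rng.gauss(center, 0.1) for _ in range(dim)] for _ in range(n)]


train = {
    "summarization": toy_embeddings(0.0),
    "translation": toy_embeddings(5.0),
}
adapter_centroids = {
    name: kmeans_centroids(emb, k=3) for name, emb in train.items()
}

# An unlabeled query embedded near the "translation" training data.
chosen = select_adapter([4.9] * 8, adapter_centroids)
print(chosen)
```

The router never touches adapter weights. It needs only embeddings of each adapter's training data, which is what makes this kind of selection independent of the PEFT method.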
In practice
- Integrate with Llama 3.2 1B Instruct for NLP tasks.
- Scale adapter pools without retraining routers.
- Use with arbitrary PEFT methods.
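To make the point about scaling without retraining concrete, here is a hedged sketch of a centroid registry. The class `CentroidRouter`, its methods, the adapter names, and the two-dimensional centroids are all hypothetical. Euclidean distance is again an assumption, and centroids are taken as already computed.

```python
import math


class CentroidRouter:
    """Hypothetical registry mapping adapter names to precomputed centroids."""

    def __init__(self):
        self._centroids = {}

    def register(self, name, centroids):
        # Adding an adapter only stores its centroids; existing entries
        # are left untouched and nothing is retrained.
        self._centroids[name] = [list(c) for c in centroids]

    def route(self, query_embedding):
        if not self._centroids:
            raise ValueError("no adapters registered")
        # Pick the adapter owning the centroid closest to the query.
        return min(
            self._centroids,
            key=lambda name: min(
                math.dist(query_embedding, c) for c in self._centroids[name]
            ),
        )


router = CentroidRouter()
router.register("sentiment", [[0.0, 0.0], [0.5, 0.2]])
router.register("qa", [[4.0, 4.0]])

query = [9.0, 9.0]
before = router.route(query)  # closest adapter currently in the pool

# A new adapter joins the pool later; only its centroids are added.
router.register("code", [[9.2, 8.8]])
after = router.route(query)
```

In this toy run the same query is routed to a different adapter once a closer one is registered, with no change to the entries that were already there.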
Topics
- Adapter Routing
- PEFT
- Inference Optimization
- Llama 3.2
- NLP Tasks
- Training-Free Methods
Best for: AI Architect, AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.