Understanding dynamic resource allocation in Kubernetes
Summary
Kubernetes Dynamic Resource Allocation (DRA), generally available in v1.35, offers a more precise way to manage hardware resources such as GPUs. This post, published on July 1, 2026, walks through an implementation using NVIDIA's maturing dra-driver-nvidia-gpu in a CNTUG Infra Labs environment running Kubernetes v1.35.3 and containerd 2.2.2, with NVIDIA RTX A5000 and Tesla T10 GPUs. It shows how to install the NVIDIA GPU Operator v26.3.1 and NVIDIA DRA Driver GPU v25.12.0, then works through four practical scenarios: sharing a single GPU across containers, prioritizing specific GPU models in deployments (e.g., A5000 over T10), requesting GPUs by memory capacity (e.g., >20GiB), and configuring GPU Time Slicing for shared access.
Key takeaway
For AI/ML engineers and DevOps teams running GPU-intensive workloads on Kubernetes, DRA in v1.35 allows more flexible and precise GPU allocation than the legacy Device Plugin. You can declaratively specify GPU type, memory, or sharing strategy, which improves resource utilization. Consider migrating existing GPU deployments to DRA for easier management and scaling, but check how rolling updates interact with ResourceClaimTemplates during the transition: each Pod gets its own claim generated from the template, so a replacement Pod can need a free GPU before the Pod it replaces has released one.
Key insights
Kubernetes DRA provides granular, declarative GPU allocation and overcomes the limitations of the older Device Plugin model.
Principles
- DeviceClass categorizes available hardware.
- ResourceSlice tracks node-specific device pools.
- ResourceClaim and ResourceClaimTemplate express device requests; a template generates a separate claim for each Pod.
Method
Install NVIDIA GPU Operator and DRA Driver, then define ResourceClaims or ResourceClaimTemplates using `exactly` or `firstAvailable` with CEL expressions for precise device selection.
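As a rough illustration of the `firstAvailable` form, the sketch below builds a ResourceClaimTemplate manifest as a Python dict and prints it as JSON, which `kubectl apply -f -` accepts. It is not taken from the original article. The DeviceClass name `gpu.nvidia.com`, the `productName` attribute, the product name strings, and the template and subrequest names are all assumptions; check them against the ResourceSlices the driver actually publishes in your cluster.

```python
import json


def product_selector(product_name: str, driver: str = "gpu.nvidia.com") -> dict:
    """Build a CEL selector that matches a device by product name.

    Assumption: the driver publishes a `productName` attribute under
    the `gpu.nvidia.com` domain.
    """
    expression = f'device.attributes["{driver}"].productName == "{product_name}"'
    return {"cel": {"expression": expression}}


def prioritized_gpu_template(name: str, preferred_products: list[str],
                             device_class: str = "gpu.nvidia.com") -> dict:
    """Build a ResourceClaimTemplate whose single request lists alternatives.

    Subrequests under `firstAvailable` are tried in order, so the first
    entry in `preferred_products` is the most preferred model.
    """
    subrequests = [
        {
            "name": f"choice-{index}",  # hypothetical subrequest names
            "deviceClassName": device_class,
            "selectors": [product_selector(product)],
        }
        for index, product in enumerate(preferred_products)
    ]
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaimTemplate",
        "metadata": {"name": name},
        "spec": {
            "spec": {
                "devices": {
                    "requests": [{"name": "gpu", "firstAvailable": subrequests}]
                }
            }
        },
    }


# Product name strings are illustrative; use the values your nodes report.
template = prioritized_gpu_template("prefer-a5000", ["NVIDIA RTX A5000", "Tesla T10"])
print(json.dumps(template, indent=2))
```

Saved as a script, the output can be piped straight to `kubectl apply -f -`. Generating the manifest from a list keeps the preference order in one place, which makes it easy to reorder or extend the fallback models.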
In practice
- Share a single GPU among multiple containers.
- Prioritize specific GPU models (e.g., A5000).
- Request GPUs based on memory capacity (e.g., >20GiB).
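Two of the scenarios above, GPU sharing and memory-based selection, can be sketched together. This is a minimal example under stated assumptions, not the article's own manifests: the DeviceClass and capacity domain `gpu.nvidia.com`, the `memory` capacity name, the container image, and all object names are hypothetical. The Pod declares one claim in `spec.resourceClaims`, and both containers reference it by name, so they share the same allocated GPU.

```python
import json

DRIVER = "gpu.nvidia.com"  # assumed DeviceClass name and capacity domain


def min_memory_claim_template(name: str, min_memory: str = "20Gi") -> dict:
    """ResourceClaimTemplate requesting one GPU with more than `min_memory`.

    Assumption: the driver publishes a `memory` capacity for each device.
    """
    expression = (
        f'device.capacity["{DRIVER}"].memory'
        f'.isGreaterThan(quantity("{min_memory}"))'
    )
    request = {
        "name": "gpu",
        "exactly": {
            "deviceClassName": DRIVER,
            "selectors": [{"cel": {"expression": expression}}],
        },
    }
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaimTemplate",
        "metadata": {"name": name},
        "spec": {"spec": {"devices": {"requests": [request]}}},
    }


def shared_gpu_pod(name: str, template_name: str, images: list[str]) -> dict:
    """Pod whose containers all reference the same claim, sharing one GPU."""
    claim_ref = "shared-gpu"  # pod-local claim name, hypothetical
    containers = [
        {
            "name": f"worker-{index}",
            "image": image,
            "resources": {"claims": [{"name": claim_ref}]},
        }
        for index, image in enumerate(images)
    ]
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": containers,
            "resourceClaims": [
                {"name": claim_ref, "resourceClaimTemplateName": template_name}
            ],
        },
    }


claim_template = min_memory_claim_template("big-gpu")
pod = shared_gpu_pod("gpu-share-demo", "big-gpu", ["ubuntu:24.04", "ubuntu:24.04"])
print(json.dumps({"apiVersion": "v1", "kind": "List",
                  "items": [claim_template, pod]}, indent=2))
```

Wrapping both objects in a `List` lets a single `kubectl apply -f -` create the template and the Pod together.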
Topics
- Kubernetes
- Dynamic Resource Allocation
- GPU Management
- NVIDIA GPU Operator
- ResourceClaim
- ResourceClaimTemplate
Best for: Machine Learning Engineer, AI Engineer, DevOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Cloud Native Computing Foundation.