If Rotating Secrets Requires a Ticket, It’s Not a Process — It’s a Problem
Summary
An audit of an MLOps platform built on Kubernetes revealed significant access-management problems, even though the system appeared stable from the outside. The core problem was not missing rotation alone but a lack of control over identities, tokens, and secrets. The audit found service accounts named after individual users, such as "john-smith" and "daniel-wilson", in the `kube-system` namespace. Some had existed for hundreds of days, and one remained active more than a year after the employee had left. These accounts were tied to long-lived, unrotated tokens, often with `cluster-admin` permissions, and had no observable lifecycle. Numerous `Opaque` secrets, such as `airflow-creds` and `grafana-creds`, were similarly long-lived and lacked clear ownership, and some were duplicated across namespaces. The analysis concluded that the system had outgrown its ad-hoc access model, creating critical operability and security risks.
Key takeaway
For CTOs and VPs of Engineering overseeing MLOps platforms, recognize that ad-hoc access management practices, while pragmatic at small scale, become critical constraints as systems grow. Your platform's apparent stability can mask deep operability and security vulnerabilities stemming from ungoverned identities, long-lived tokens, and unmanaged secrets. Prioritize establishing a clear, governable access model, centralizing identity management, and enforcing short-lived, workload-specific credentials to prevent silent failures and ensure safe, predictable system changes.
Key insights
Uncontrolled access in MLOps platforms, often due to ad-hoc practices, creates systemic operability and security risks.
Principles
- Access control is about governance, not just rotation.
- Service accounts should represent workloads, not human identities.
- Access is systemic, not local, in MLOps environments.
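One way to test the second principle is to look for service accounts that no workload runs as. The sketch below is a minimal illustration, not the audit's own tooling: it assumes the two inputs are the parsed output of `kubectl get sa -A -o json` and `kubectl get pods -A -o json`, and the function name is hypothetical.

```python
def service_accounts_without_pods(service_accounts: dict, pods: dict) -> list[tuple[str, str]]:
    """Return (namespace, name) of service accounts that no listed pod runs as."""
    # Collect every (namespace, service account) pair that a pod actually uses.
    in_use = {
        (pod["metadata"]["namespace"],
         pod["spec"].get("serviceAccountName", "default"))
        for pod in pods.get("items", [])
    }
    # Anything not in that set is a candidate for review.
    return sorted(
        (sa["metadata"]["namespace"], sa["metadata"]["name"])
        for sa in service_accounts.get("items", [])
        if (sa["metadata"]["namespace"], sa["metadata"]["name"]) not in in_use
    )
```

A result is a starting point for review, not proof of misuse: an account can be idle because its job is not currently scheduled, or because its token is used from outside the cluster, which is the pattern a human-named account would typically show.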
Method
Audit Kubernetes access in four steps: list service accounts (`kubectl get sa -A`), list token secrets (`kubectl get secrets -A | grep token`), find `cluster-admin` bindings (`kubectl get clusterrolebinding | grep cluster-admin`), and review all secrets (`kubectl get secrets -A`).
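The permissions step can be made more specific by reading the binding subjects from JSON output instead of grepping the table. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and `kubectl_json` needs `kubectl` on the PATH and access to a cluster.

```python
import json
import subprocess


def cluster_admin_service_accounts(bindings: dict) -> list[tuple[str, str, str]]:
    """Return (binding, namespace, service account) for every ServiceAccount
    subject bound to the cluster-admin ClusterRole."""
    found = []
    for item in bindings.get("items", []):
        if item.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        # `subjects` may be missing or null on a binding with no subjects.
        for subject in item.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount":
                found.append((item["metadata"]["name"],
                              subject.get("namespace", ""),
                              subject["name"]))
    return found


def kubectl_json(*args: str) -> dict:
    """Run a kubectl command with `-o json` and parse the result."""
    result = subprocess.run(["kubectl", *args, "-o", "json"],
                            check=True, capture_output=True, text=True)
    return json.loads(result.stdout)
```

With cluster access, `cluster_admin_service_accounts(kubectl_json("get", "clusterrolebinding"))` would list each service account that holds `cluster-admin`, alongside the binding that grants it.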
In practice
- Use `kubectl get sa -A` to map existing identities.
- Check the age of `kubernetes.io/service-account-token` secrets to find long-lived tokens.
- Audit `Opaque` secrets for ownership and rotation.
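The age checks above can be sketched as a small filter over the parsed output of `kubectl get secrets -A -o json`. This is an illustration only: the function name and the threshold are assumptions, and age since creation is just a proxy, because a secret updated in place keeps its original `creationTimestamp`. Ownership is not covered, since it depends on labeling conventions the source does not describe.

```python
from datetime import datetime, timezone

TOKEN_TYPE = "kubernetes.io/service-account-token"


def stale_secrets(secrets: dict, max_age_days: int,
                  now: datetime) -> list[tuple[str, str, str, int]]:
    """Return (namespace, name, type, age in days) for token and Opaque
    secrets older than max_age_days, oldest first."""
    stale = []
    for item in secrets.get("items", []):
        if item.get("type") not in (TOKEN_TYPE, "Opaque"):
            continue
        meta = item["metadata"]
        # creationTimestamp is RFC 3339 in UTC, e.g. "2023-01-15T09:30:00Z".
        created = datetime.strptime(
            meta["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        age = (now - created).days
        if age > max_age_days:
            stale.append((meta.get("namespace", ""), meta["name"],
                          item["type"], age))
    return sorted(stale, key=lambda row: row[3], reverse=True)
```

Calling `stale_secrets(data, 90, datetime.now(timezone.utc))` would return every token or `Opaque` secret created more than 90 days ago; the 90-day value is a placeholder to be replaced by whatever rotation policy applies.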
Topics
- Kubernetes Access Control
- MLOps Security
- Secrets Management
- Service Account Lifecycle
- Kubeconfig Security
Best for: CTO, VP of Engineering/Data, MLOps Engineer, AI Architect, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by HackerNoon.