If Rotating Secrets Requires a Ticket, It’s Not a Process — It’s a Problem

Source: HackerNoon · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure, Software Development & Engineering) · Depth: Intermediate, long

Summary

An audit of an MLOps platform built on Kubernetes revealed significant access-management issues, even though the system appeared stable from the outside. The core problem was not merely missing rotation but a lack of control over identities, tokens, and secrets. The audit uncovered user-level service accounts such as "john-smith" and "daniel-wilson" in the `kube-system` namespace. Some had existed for hundreds of days, and one remained active more than a year after the employee's departure. These service accounts were tied to long-lived, unrotated tokens, often with `cluster-admin` permissions, and were not linked to any observable lifecycle. Numerous `Opaque` secrets, such as `airflow-creds` and `grafana-creds`, were similarly long-lived and lacked clear ownership, and some were duplicated across namespaces. The analysis concluded that the system had outgrown its ad-hoc access model, creating critical operability and security risks.

Key takeaway

For CTOs and VPs of Engineering overseeing MLOps platforms, recognize that ad-hoc access management practices, while pragmatic at small scale, become critical constraints as systems grow. Your platform's apparent stability can mask deep operability and security vulnerabilities stemming from ungoverned identities, long-lived tokens, and unmanaged secrets. Prioritize establishing a clear, governable access model, centralizing identity management, and enforcing short-lived, workload-specific credentials to prevent silent failures and ensure safe, predictable system changes.

Key insights

Uncontrolled access in MLOps platforms, often due to ad-hoc practices, creates systemic operability and security risks.

Method

Audit Kubernetes access in four passes, in this order: service accounts (`kubectl get sa -A`), tokens (`kubectl get secrets -A | grep token`), permissions (`kubectl get clusterrolebinding | grep cluster-admin`), and finally all secrets (`kubectl get secrets -A`).
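
The `grep` passes above show that something matches, but not which identities hold `cluster-admin` or how old their tokens are. The following is a minimal sketch, not the article's tooling, of how the same audit could be scripted in Python against `kubectl ... -o json` output. The function names and the 90-day threshold are illustrative assumptions, and it presumes a `kubectl` binary already configured with read access to the cluster.

```python
import json
import subprocess
from datetime import datetime, timezone


def kubectl_json(*args):
    """Run a read-only kubectl command and return its parsed JSON output."""
    out = subprocess.run(
        ["kubectl", *args, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def age_days(obj, now):
    """Age of a Kubernetes object in whole days, from metadata.creationTimestamp."""
    created = datetime.strptime(
        obj["metadata"]["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (now - created).days


def cluster_admin_service_accounts(bindings):
    """Return the set of (namespace, name) ServiceAccounts bound to cluster-admin."""
    found = set()
    for binding in bindings["items"]:
        if binding["roleRef"]["name"] != "cluster-admin":
            continue
        # A binding may have no subjects at all, so default to an empty list.
        for subject in binding.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount":
                found.add((subject.get("namespace", ""), subject["name"]))
    return found


def old_token_secrets(secrets, now, max_age_days=90):
    """Return (namespace, name, age_days) for token Secrets older than the threshold.

    The 90-day default is an arbitrary illustration, not a recommendation
    taken from the article.
    """
    result = []
    for secret in secrets["items"]:
        if secret.get("type") != "kubernetes.io/service-account-token":
            continue
        days = age_days(secret, now)
        if days > max_age_days:
            meta = secret["metadata"]
            result.append((meta["namespace"], meta["name"], days))
    # Oldest first, so the riskiest tokens appear at the top of the report.
    return sorted(result, key=lambda row: -row[2])


def run_audit():
    """Print cluster-admin service accounts and long-lived token Secrets."""
    now = datetime.now(timezone.utc)
    admins = cluster_admin_service_accounts(kubectl_json("get", "clusterrolebinding"))
    for namespace, name in sorted(admins):
        print(f"cluster-admin: {namespace}/{name}")
    for namespace, name, days in old_token_secrets(kubectl_json("get", "secrets", "-A"), now):
        print(f"token secret {namespace}/{name} is {days} days old")
```

Calling `run_audit()` issues only `get` requests, so it does not modify the cluster. Keeping the parsing functions separate from the `kubectl` call lets them be checked against saved JSON dumps before they are pointed at a live environment.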

Best for: CTO, VP of Engineering/Data, MLOps Engineer, AI Architect, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by HackerNoon.