Building a Cluster-Aware AI Agent with Kubernetes, Argo CD, and GitOps

Source: Cloud Native Computing Foundation · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure) · Depth: Intermediate, medium

Summary

The article describes building a self-hosted, read-only AI agent that runs inside a Kubernetes cluster and is deployed using GitOps principles. Ollama serves a local Mistral 7B LLM, and a FastAPI application provides the agent's API and chat UI. In the CI/CD pipeline, GitHub Actions builds the container images, Argo CD Image Updater commits the new image tags to Git, and Argo CD reconciles the cluster to match the repository. The agent observes live cluster state through a dedicated read-only Kubernetes `ServiceAccount` and `ClusterRole` and passes that state to the model as context (Retrieval-Augmented Generation, RAG), so its diagnostics are grounded in the actual cluster and no data leaves it. The design is auditable through `git log`. The agent demonstrates two modes: a general `/ask` endpoint and a cluster-aware `/diagnose` endpoint.
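The retrieval step behind a cluster-aware `/diagnose` call can be sketched as follows. This is a minimal illustration, not the article's implementation: the Service URL `http://ollama:11434`, the model tag `mistral`, and the helper names `build_prompt`, `build_request`, and `generate` are assumptions. The only external interface used is Ollama's `/api/generate` endpoint with a non-streaming JSON request.

```python
import json
import urllib.request

# Assumption: Ollama is exposed in-cluster under this Service name and port.
OLLAMA_URL = "http://ollama:11434/api/generate"
# Assumption: the Ollama model tag used for Mistral 7B.
MODEL = "mistral"


def build_prompt(question: str, cluster_facts: list[str]) -> str:
    """Ground the question in observed cluster state (the 'retrieval' in RAG)."""
    context = "\n".join(f"- {fact}" for fact in cluster_facts)
    if not context:
        context = "- (no cluster data retrieved)"
    return (
        "You are a read-only Kubernetes diagnostics assistant.\n"
        "Answer using only the cluster state below.\n\n"
        f"Cluster state:\n{context}\n\n"
        f"Question: {question}\n"
    )


def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": MODEL, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local model and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

In this sketch, the facts passed to `build_prompt` would come from read-only Kubernetes API calls made under the agent's `ServiceAccount`; a general `/ask` path would presumably send the question to the model without the cluster-state section.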

Key takeaway

For DevOps or platform engineers evaluating AI agent integration, this self-hosted, read-only Kubernetes agent provides a concrete starting point. Deploying a local LLM with Ollama and connecting it to live cluster state through read-only RBAC lets you experiment safely with grounded diagnostics. You can study the full agent loop and its GitOps-driven CI/CD with no data leaving the cluster, which keeps AI adoption within your infrastructure secure, auditable, and practical.

Key insights

A self-hosted, read-only AI agent within Kubernetes leverages local LLMs and GitOps for cluster-aware diagnostics, ensuring data privacy and auditable operations.

Method

Deploy Ollama and FastAPI pods, with the agent running under a read-only `ServiceAccount`. Use GitHub Actions for image builds, Argo CD Image Updater to commit new tags to Git, and Argo CD for cluster reconciliation.
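The read-only access described above could look like the following sketch, which builds the RBAC manifests as Python dictionaries and checks that no rule grants a mutating verb. The object names (`ai-agent-reader`, `ai-agent`), the chosen resources, and the `ClusterRoleBinding` are illustrative assumptions; the source names only the `ServiceAccount` and `ClusterRole`.

```python
import json

# Verbs that only read state; anything else (create, delete, patch, "*") mutates.
READ_ONLY_VERBS = {"get", "list", "watch"}


def read_only_cluster_role(name: str) -> dict:
    """ClusterRole manifest granting read access to a few common resources."""
    verbs = sorted(READ_ONLY_VERBS)
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            # Assumption: these are the resources the agent needs to inspect.
            {"apiGroups": [""], "resources": ["pods", "events", "nodes"], "verbs": verbs},
            {"apiGroups": ["apps"], "resources": ["deployments", "replicasets"], "verbs": verbs},
        ],
    }


def cluster_role_binding(role_name: str, sa_name: str, namespace: str) -> dict:
    """Bind the ClusterRole to the agent's ServiceAccount (assumed wiring)."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": f"{role_name}-binding"},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": role_name,
        },
        "subjects": [{"kind": "ServiceAccount", "name": sa_name, "namespace": namespace}],
    }


def is_read_only(role: dict) -> bool:
    """True only if every rule's verbs are a subset of get/list/watch."""
    return all(set(rule["verbs"]) <= READ_ONLY_VERBS for rule in role["rules"])


if __name__ == "__main__":
    role = read_only_cluster_role("ai-agent-reader")  # hypothetical name
    assert is_read_only(role)
    # JSON manifests can be committed to Git and applied like YAML ones.
    print(json.dumps(role, indent=2))
```

A check like `is_read_only` could run in CI before the manifests reach the Git repository that Argo CD reconciles, so that a change widening the agent's permissions fails review early.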

Best for: MLOps Engineer, DevOps Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Cloud Native Computing Foundation.