Data Science Meets Devops: MLOps with Jupyter, Git, & Kubernetes

· Source: Hamel Husain's Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Cloud Computing & IT Infrastructure · Depth: Advanced, long

Summary

Kubeflow maintainers built a CI/CD system for the machine learning model behind their custom Issue Label Bot, which helps them triage a high volume of GitHub issues. Their initial model, trained on public repositories, predicted generic labels. They then used Google AutoML to train a Kubeflow-specific model, which achieved an average precision of 72% and an average recall of 50% across Kubeflow-specific labels, with per-label results such as "area-jupyter" (0.9 precision, 0.7 recall) and "area-katib" (0.8 precision, 1.0 recall). The CI/CD pipeline, illustrated in Figure 2 of the original article, uses independent controllers rather than a Directed Acyclic Graph (DAG) to handle preprocessing, training, validation, and deployment. The system combines Jupyter notebooks for model development, GitOps for CI/CD, and Kubernetes with managed cloud services for infrastructure, so that the issue-labeling model is continuously retrained and redeployed.
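To make the per-label figures concrete, here is a minimal sketch of how precision and recall are computed for a single label. The counts are hypothetical, chosen only so that the result matches the 0.9 / 0.7 reported for "area-jupyter"; they are not taken from the original article.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Of the issues the model tagged with a label, the fraction tagged correctly."""
    predicted = true_pos + false_pos
    return true_pos / predicted if predicted else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Of the issues that truly carry a label, the fraction the model found."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0


# Hypothetical counts for one label (illustration only, not real data).
true_pos, false_pos, false_neg = 63, 7, 27

label_precision = precision(true_pos, false_pos)  # 63 / 70
label_recall = recall(true_pos, false_neg)        # 63 / 90
print(f"precision={label_precision:.2f} recall={label_recall:.2f}")
```

Read this way, a label with high precision but lower recall is one the bot rarely applies wrongly but often fails to apply at all.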

Key takeaway

MLOps engineers building resilient, scalable ML CI/CD pipelines should consider a reconciler-based GitOps approach instead of a traditional DAG. As Kubeflow's issue bot shows, this approach improves system resilience and team autonomy by letting teams choose their tools independently and manage state declaratively. To get started, adapt the provided Dockerfile and kustomize package to deploy your own ModelSync controller, then define your own lambdas and Tekton pipelines.

Key insights

Independent, reconciler-based controllers offer a resilient alternative to DAGs for ML CI/CD.

Method

The solution uses two independent Kubernetes controllers: a Trainer that periodically checks model freshness and initiates retraining, and a Deployer that ensures the correct model is deployed, both driven by GitOps.
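The two-controller idea can be sketched as a pair of reconcile functions that each compare observed state with desired state and act only on a mismatch. This is an illustrative sketch, not the Kubeflow implementation: `ModelState`, `needs_training`, `reconcile_trainer`, `reconcile_deployer`, and `simulate` are hypothetical names, the age-based freshness rule is an assumption, and the `train` and `deploy` callables stand in for whatever actually launches a training job or rolls out a model (in a GitOps setup, the deploy step might be a committed config change).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional


@dataclass
class ModelState:
    """Observed state shared by both controllers (hypothetical structure)."""
    latest_trained: Optional[str] = None   # name of the newest trained model
    trained_at: Optional[datetime] = None  # when that model was trained
    deployed: Optional[str] = None         # model currently serving


def needs_training(state: ModelState, now: datetime, max_age: timedelta) -> bool:
    """Assumed freshness rule: retrain if no model exists or it is too old."""
    return state.trained_at is None or now - state.trained_at > max_age


def reconcile_trainer(state: ModelState, now: datetime, max_age: timedelta,
                      train: Callable[[], str]) -> None:
    """Trainer: start a training run only when the model is stale."""
    if needs_training(state, now, max_age):
        state.latest_trained = train()
        state.trained_at = now


def reconcile_deployer(state: ModelState, deploy: Callable[[str], None]) -> None:
    """Deployer: roll out the newest model only if it is not already serving."""
    if state.latest_trained is not None and state.deployed != state.latest_trained:
        deploy(state.latest_trained)
        state.deployed = state.latest_trained


def simulate(days: int, max_age_days: int) -> List[str]:
    """Run both reconcilers once per simulated day; return the deploy history."""
    state = ModelState()
    start = datetime(2020, 1, 1)
    deploys: List[str] = []
    version = 0

    def train() -> str:
        nonlocal version
        version += 1
        return f"model-{version}"

    for day in range(days):
        now = start + timedelta(days=day)
        # Neither controller calls the other; they only read and write state.
        reconcile_trainer(state, now, timedelta(days=max_age_days), train)
        reconcile_deployer(state, deploys.append)
    return deploys


print(simulate(days=5, max_age_days=2))
```

Because each function is idempotent and consults only the current state, either controller can crash, restart, or be replaced without the other needing to know, which is the resilience argument made for reconcilers over a DAG.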

Best for: MLOps Engineer, DevOps Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Hamel Husain's Blog.