Building a Movie Recommendation System (Ke-Netflix)

2026-06-23 · Source: Machine Learning on Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Data Science & Analytics · Depth: Intermediate, medium

Summary

Ke-Netflix is an automated hybrid movie recommendation system built on the MovieLens dataset, which initially comprises roughly 9,000 movies, 610 users, and 100,000 ratings. The system includes a dedicated text-cleaning engine for movie titles, uses the TMDB API to enrich existing records and discover new movies, and simulates ongoing user activity so that the dataset keeps changing. It uses two recommendation algorithms: content-based filtering over feature vectors, and collaborative filtering via Singular Value Decomposition (SVD), which achieves an RMSE of 0.99 and a Precision@10 of 15%. The two are combined into a hybrid score weighted 60% collaborative and 40% content-based. The data is stored in PostgreSQL hosted on Neon, and the whole pipeline is automated with GitHub Actions: a weekly movie synchronization plus daily and weekly recommendation refreshes. A Streamlit application provides a personalized, explainable user interface over 9,900+ movies and 18,300+ recommendations.
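
The 60/40 weighting can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function names (min_max_scale, hybrid_scores), the min-max rescaling step, and the choice to treat a movie missing from one source as scoring 0 there are all assumptions. Only the 0.6 and 0.4 weights come from the summary.

```python
from typing import Dict, List, Tuple

# Weights from the summary: 60% collaborative, 40% content-based.
COLLAB_WEIGHT = 0.6
CONTENT_WEIGHT = 0.4


def min_max_scale(scores: Dict[int, float]) -> Dict[int, float]:
    """Rescale scores to [0, 1] so both sources share a scale (assumed step)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores identical: no ranking signal, so map everything to 0.
        return {movie_id: 0.0 for movie_id in scores}
    return {movie_id: (s - lo) / (hi - lo) for movie_id, s in scores.items()}


def hybrid_scores(
    collab: Dict[int, float], content: Dict[int, float]
) -> List[Tuple[int, float]]:
    """Blend two score dictionaries and return (movie_id, score), best first."""
    collab_n = min_max_scale(collab)
    content_n = min_max_scale(content)
    blended = {
        movie_id: COLLAB_WEIGHT * collab_n.get(movie_id, 0.0)
        + CONTENT_WEIGHT * content_n.get(movie_id, 0.0)
        for movie_id in set(collab_n) | set(content_n)
    }
    # Sort by descending score, breaking ties by movie id for determinism.
    return sorted(blended.items(), key=lambda pair: (-pair[1], pair[0]))
```

Rescaling matters because SVD typically predicts on the rating scale while content similarity usually lies in [0, 1]; without it, the nominal 60/40 split would not reflect the real influence of each source.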

Key takeaway

For MLOps engineers designing or scaling recommendation systems: prioritize data engineering and end-to-end automation over isolated model training. The system must handle messy data, grow its catalog, and refresh recommendations without manual intervention. Implement idempotent data cleaning, integrate external APIs for enrichment, and simulate user behavior to keep the dataset dynamic. Automate the full pipeline with a tool such as GitHub Actions so that it runs reliably and continuously, and surface explainable recommendations through a user-friendly interface.

Key insights

Building production-ready recommendation systems requires data engineering and automation in addition to model training.

Principles

Data cleaning and schema design precede ML.
Idempotent pipelines enable reliable automation.
Hybrid algorithms improve recommendation quality.
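
To make the idempotency principle concrete, here is a small sketch of a load step that can be re-run safely. It uses Python's built-in sqlite3 as a stand-in for PostgreSQL so it runs anywhere; the table layout, the sync_movies name, and the sample rows are hypothetical and not taken from the project.

```python
import sqlite3
from typing import Iterable, Tuple


def sync_movies(conn: sqlite3.Connection, movies: Iterable[Tuple[int, str]]) -> int:
    """Upsert (movie_id, title) rows and return the resulting row count.

    Running this twice with the same input leaves the table unchanged,
    which is what lets a scheduled job retry or re-run without duplicates.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies ("
        "movie_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.executemany(
        # Insert new rows; on a key collision, overwrite the title instead.
        "INSERT INTO movies (movie_id, title) VALUES (?, ?) "
        "ON CONFLICT(movie_id) DO UPDATE SET title = excluded.title",
        list(movies),
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM movies").fetchone()[0]
```

PostgreSQL supports a similar INSERT ... ON CONFLICT ... DO UPDATE form, so the same pattern should carry over with the placeholder style changed to match the driver in use.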

Method

Develop a text-cleaning engine, simulate user behavior, implement content-based and collaborative filtering, then automate with scheduled workflows.
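
The content-based step of this method could look roughly like the sketch below: cosine similarity between movie feature vectors, computed once and stored as a top-k neighbor table so that serving a recommendation becomes a lookup. The feature layout (for example, one flag per genre) and the names cosine and precompute_neighbors are assumptions for illustration; the article's real feature set is not described in this summary.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def precompute_neighbors(
    features: Dict[int, Sequence[float]], k: int
) -> Dict[int, List[Tuple[int, float]]]:
    """For every movie, store its k most similar other movies, best first."""
    neighbors: Dict[int, List[Tuple[int, float]]] = {}
    for movie_id, vector in features.items():
        sims = [
            (other_id, cosine(vector, other_vector))
            for other_id, other_vector in features.items()
            if other_id != movie_id
        ]
        # Highest similarity first; ties broken by movie id for determinism.
        sims.sort(key=lambda pair: (-pair[1], pair[0]))
        neighbors[movie_id] = sims[:k]
    return neighbors
```

This brute-force version compares every pair of movies, which is workable for a catalog of about ten thousand titles in a scheduled batch job; a larger catalog would likely call for vectorized or approximate nearest-neighbor search.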

In practice

Use Unicode normalization and regex for messy text.
Integrate external APIs (e.g., TMDB) for data enrichment.
Precompute similarities for faster content-based recommendations.
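
As an illustration of the first tip, the sketch below combines Unicode normalization with two regular expressions to tidy a messy title. It is an assumed minimal cleaner, not the project's text-cleaning engine: the clean_title name and the specific rules (NFKC form, control characters replaced by spaces, whitespace collapsed) are choices made for this example.

```python
import re
import unicodedata

# ASCII control characters (tabs, newlines, NULs, DEL) that leak in from scraping.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")
_WHITESPACE = re.compile(r"\s+")


def clean_title(raw: str) -> str:
    """Return a normalized title; intended to be safe to apply repeatedly."""
    # NFKC composes accents (e + combining acute -> é) and folds
    # compatibility forms such as full-width letters and non-breaking spaces.
    text = unicodedata.normalize("NFKC", raw)
    # Replace control characters with spaces so words do not get glued together.
    text = _CONTROL_CHARS.sub(" ", text)
    # Collapse runs of whitespace and trim the ends.
    return _WHITESPACE.sub(" ", text).strip()
```

Applying clean_title to its own output should return the same string, which makes the function safe to run again on rows that earlier pipeline runs have already cleaned.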

Topics

Recommendation Systems
Data Engineering
MLOps
Streamlit
Collaborative Filtering
Content-Based Filtering
GitHub Actions

Code references

kene1123/KE-Netflix-Recommendation-System-v1

Best for: Machine Learning Engineer, AI Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning on Medium.