Automated Machine Learning — A Paradigm Shift That Accelerates Data Scientist Productivity @ Airbnb

2017-05-11 · Source: Hamel Husain's Blog · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Intermediate, medium

Summary

Airbnb's data science team, facing repetitive tasks in machine learning workflows, has adopted Automated Machine Learning (AML) to enhance productivity. AML automates crucial steps such as exploratory data analysis, feature transformations, algorithm selection, hyper-parameter tuning, and model diagnostics. While not a complete replacement for data scientists due to the need for domain knowledge, AML significantly boosts productivity, particularly for regression and classification problems with tabular datasets. Airbnb has successfully applied AML for benchmarking challenger models, detecting target leakage, and generating canonical diagnostics. They have experimented with tools like TPOT, Auto-Sklearn, Auto-Weka, Machine-JS, and DataRobot. A case study on customer lifetime value (LTV) models demonstrated that AML helped reduce model error by over 5% by identifying competitive linear models and exploring feature engineering steps and hyper-parameter tuning that manual efforts missed.

Key takeaway

For AI engineers building and deploying machine learning models, integrating Automated Machine Learning (AML) into the workflow can improve both efficiency and model accuracy. Consider running an AML platform as a "good modeling hygiene" practice, especially for tabular regression and classification problems: it lets you benchmark challenger models quickly, detect target leakage, and explore a broader range of algorithms and hyper-parameter settings than manual effort allows. In Airbnb's customer lifetime value case study, this reduced model error by over 5%.

Key insights

Automated Machine Learning (AML) significantly boosts data scientist productivity by automating repetitive ML workflow tasks.

Principles

AML excels in tabular regression/classification.
AML aids in unbiased model benchmarking.
Human judgment remains crucial for problem setup.

Method

AML frameworks automate exploratory data analysis, feature engineering, algorithm selection, hyper-parameter tuning, and model diagnostics to accelerate model development and improve accuracy.
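
The algorithm-selection and hyper-parameter-tuning steps described above can be illustrated with a minimal sketch. It is not the approach of any specific tool named in the article: the candidate models, the synthetic data, and the helper names (`mean_model`, `knn_model`, `cv_mse`, `auto_select`) are all assumptions made for illustration. It scores every candidate with cross-validation and keeps the one with the lowest error, which is the core loop an AML framework runs at much larger scale.

```python
import random
from statistics import mean


def mean_model(train_x, train_y):
    """Baseline candidate: always predict the training mean."""
    avg = mean(train_y)
    return lambda x: avg


def knn_model(k):
    """Build a fit function for a 1-D k-nearest-neighbour regressor."""
    def fit(train_x, train_y):
        pairs = list(zip(train_x, train_y))

        def predict(x):
            nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
            return mean(y for _, y in nearest)

        return predict
    return fit


def cv_mse(fit, xs, ys, folds=5):
    """Cross-validated mean squared error using strided folds."""
    errors = []
    for f in range(folds):
        test_idx = set(range(f, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit(train_x, train_y)
        errors.extend((predict(xs[i]) - ys[i]) ** 2 for i in test_idx)
    return mean(errors)


def auto_select(xs, ys, candidates):
    """Score every candidate and return the best name plus all scores."""
    scores = {name: cv_mse(fit, xs, ys) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic tabular data (hypothetical): y is roughly 2 * x plus noise.
rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [2 * x + rng.gauss(0, 0.5) for x in xs]

# Search space: one baseline plus one algorithm at three hyper-parameter values.
candidates = {
    "mean_baseline": mean_model,
    "knn_k=1": knn_model(1),
    "knn_k=5": knn_model(5),
    "knn_k=15": knn_model(15),
}

best, scores = auto_select(xs, ys, candidates)
for name, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: cross-validated MSE = {score:.3f}")
print("selected:", best)
```

Keeping a trivial baseline in the candidate set mirrors the benchmarking use described in the article: a hand-built model can be compared against whatever the automated search finds.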

In practice

Use AML for competitive model benchmarking.
Employ AML to detect data leakage early.
Generate canonical diagnostics automatically.
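
As one illustration of early leakage detection, the sketch below flags any feature whose correlation with the target is suspiciously close to perfect. This heuristic is an assumption chosen for simplicity, not the method the article or any named platform uses; the feature names, the data, and the 0.98 threshold are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_a * var_b)


def flag_possible_leakage(features, target, threshold=0.98):
    """Return names of features that track the target almost perfectly."""
    return sorted(
        name
        for name, column in features.items()
        if abs(pearson(column, target)) >= threshold
    )


# Hypothetical data: one column is just a rescaled copy of the target.
target = [10, 20, 15, 30, 25, 40]
features = {
    "nights_booked": [2, 1, 4, 3, 2, 5],
    "target_copy_scaled": [t * 1.1 for t in target],
    "unrelated_score": [5, 1, 4, 2, 6, 3],
}

flagged = flag_possible_leakage(features, target)
print("features to inspect for leakage:", flagged)
```

A flagged column is only a prompt for human review: a feature can be a legitimately strong predictor, and leakage can also hide in features with weaker linear correlation.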

Topics

Automated Machine Learning
Data Scientist Productivity
Customer Lifetime Value
Model Benchmarking
Hyperparameter Tuning

Best for: AI Engineer, Data Scientist, Machine Learning Engineer, MLOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Hamel Husain's Blog.