One Month Into Learning Data Engineering in Public: Here’s What I Didn’t Write About
Summary
The author recounts their first month of learning data engineering, during which they deviated from a planned 12-month self-study roadmap. Instead of working through the tools in sequence (SQL, Python, Git, Spark, Airflow, Databricks), they took an iterative approach, continuously building and refining a small ETL pipeline. This process showed that technical hurdles such as idempotency, persistence, and portability often masked deeper conceptual "thinking lessons" about assumptions and environmental dependencies. The author also struggled with waning motivation and "shiny object syndrome," and nearly pivoted to AI engineering before a LinkedIn contact reaffirmed their original roadmap. A key lesson was that completing small, focused projects sustains learning and skill acquisition better than starting large projects that may never be finished.
Key takeaway
If you are an aspiring data engineer transitioning from another role, treat your self-study roadmap as a flexible guide, not a rigid contract. Focus on building small, iterative projects that uncover practical challenges and conceptual lessons, rather than strictly following a sequential curriculum. Documenting your journey publicly can provide accountability and unexpected external validation, which helps you stay on track when motivation wanes or "shiny object syndrome" strikes.
Key insights
Learning data engineering works best through iterative building, which surfaces conceptual lessons that studying the tools alone does not.
Principles
- Plans are starting points, not contracts.
- Technical walls often hide thinking lessons.
- Small, finished projects drive learning.
Method
The author's method involved iteratively building a small ETL pipeline, pushing it until it broke, and learning tools (SQL, Python, Git) as demanded by the pipeline's evolving needs, rather than following a strict sequential curriculum.
In practice
- Build small, complete mini-projects.
- Document your learning journey publicly.
- Distrust "it worked once" assumptions.
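One way to act on the "it worked once" warning is to run a load step twice and check that the result does not change, which is what idempotency means for a pipeline. The sketch below is illustrative only: the `orders` table, the sample rows, and the `load_rows` and `row_count` helpers are hypothetical and are not taken from the author's pipeline. It assumes the data has a natural primary key and uses SQLite's `INSERT OR REPLACE` so that a repeated run overwrites rows instead of appending duplicates.

```python
import sqlite3


def load_rows(conn, rows):
    """Load (id, amount) rows so that re-running the load does not duplicate data."""
    # Hypothetical target table; the primary key is what makes the load repeatable.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL)"
    )
    # INSERT OR REPLACE keys on the primary key, so a second run with the same
    # ids replaces the existing rows rather than adding new ones.
    conn.executemany(
        "INSERT OR REPLACE INTO orders (id, amount) VALUES (?, ?)", rows
    )
    conn.commit()


def row_count(conn):
    """Return the number of rows currently in the orders table."""
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
batch = [(1, 9.99), (2, 24.50), (3, 5.00)]  # made-up sample data

load_rows(conn, batch)
first = row_count(conn)

load_rows(conn, batch)  # simulate an accidental or scheduled re-run
second = row_count(conn)

print(first, second)
```

If the load used a plain `INSERT` into a table without a primary key, the second count would be double the first, which is the kind of failure that only shows up when a pipeline is run more than once.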
Topics
- Data Engineering
- Self-Study Roadmap
- ETL Pipelines
- Learning Strategies
- Career Transition
- Public Learning
Best for: Data Engineer, Data Analyst, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Towards Data Science.