Designing Data-Intensive Applications: The Cloud & Doing the Right Thing
Summary
Martin Kleppmann and Chris Riccomini have released a second edition of "Designing Data-Intensive Applications," updating the 2016 original to reflect the widespread adoption of cloud computing. The refreshed volume, published by O'Reilly Media, Inc. in 2026, includes new content on cloud versus self-hosting tradeoffs, cloud-native system architecture, and the evolving role of operations in the cloud era. It also dedicates a significant portion to the ethical responsibilities of software engineers, addressing issues like predictive analytics, algorithmic bias and discrimination, accountability, feedback loops, and mass surveillance. The book aims to provide foundational principles for building resilient systems, incorporating topics such as AI-driven systems, local-first software, formal methods, and regulatory contexts like GDPR.
Key takeaway
If you design data-intensive applications, critically evaluate the tradeoffs between cloud services and self-hosting based on your team's expertise and workload variability. Beyond technical choices, actively engage with the ethical implications of your systems, particularly around predictive analytics and data collection, to ensure accountability and prevent unintended societal harms. Your decisions shape not just technology but also human experiences.
Key insights
The updated "Designing Data-Intensive Applications" balances technical system design with critical ethical considerations for engineers.
Principles
- Core competencies should be in-house; non-core tasks can be outsourced.
- Cloud services excel with variable loads; self-hosting suits predictable workloads run by teams with operational experience.
- Engineers bear responsibility for software's societal consequences.
Method
When deciding between cloud and self-hosting, evaluate your organization's operational skills, how predictable the workload is, and whether the system is part of your core competency. For ethical design, reflect iteratively on what the system does, establish who is accountable for its outcomes, and consider its system-wide impacts.
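One way to make "workload predictability" concrete is to measure how spiky a load history is. The sketch below is illustrative only and is not taken from the book: the function names, the peak-to-mean metric, and the threshold of 3.0 are all assumptions chosen for the example, and it does not model core competency alignment at all.

```python
from statistics import mean, pstdev


def load_variability(samples):
    """Return (peak_to_mean, coefficient_of_variation) for load samples.

    `samples` is a hypothetical list of load measurements, for example
    requests per second recorded once per hour.
    """
    avg = mean(samples)
    if avg <= 0:
        raise ValueError("mean load must be positive")
    return max(samples) / avg, pstdev(samples) / avg


def hosting_hint(samples, has_ops_expertise, peak_to_mean_threshold=3.0):
    """Return a rough starting point for discussion: "cloud" or "self-host".

    The threshold is an assumed, illustrative value, not a recommendation.
    """
    peak_to_mean, _ = load_variability(samples)
    if peak_to_mean >= peak_to_mean_threshold:
        # Highly variable load: elastic capacity tends to pay off.
        return "cloud"
    if has_ops_expertise:
        # Predictable load and a team that can run the infrastructure.
        return "self-host"
    # Predictable load, but nobody in-house to operate it.
    return "cloud"


if __name__ == "__main__":
    steady = [100, 110, 95, 105]
    spiky = [10, 10, 10, 200]
    print(hosting_hint(steady, has_ops_expertise=True))
    print(hosting_hint(spiky, has_ops_expertise=True))
```

A hint like this is only an input to the decision; cost, compliance, and whether the system is core to the business still need human judgment.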
In practice
- Evaluate cloud services for systems with highly variable loads.
- Consider self-hosting for predictable workloads if operational expertise exists.
- Analyze potential feedback loops and biases in predictive analytics.
Topics
- Cloud Computing Tradeoffs
- Data-Intensive Systems
- Cloud-Native Architecture
- Software Engineering Ethics
- Algorithmic Bias
Best for: Software Engineer, AI Architect, MLOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by The Pragmatic Engineer.