Podcast: Engineering Stable, Secure and Scalable Platforms: A Conversation with Matthew Liste
Summary
Matthew Liste, an infrastructure engineering veteran of American Express and J.P. Morgan Chase, discusses how to build and manage stable, secure, and scalable software platforms in a podcast with Michael Stiefel. The conversation, recorded on April 20, 2026, emphasizes that platform services are the foundation for application development and must consistently uphold these "three S's." Liste highlights how unforeseen resource contention makes scaling systems inherently difficult, and he advocates using customer journeys to identify system vulnerabilities. The discussion also explores the mixed impact of artificial intelligence: it can accelerate development, but it also increases risk and hinders the apprenticeship of junior engineers by automating basic coding tasks. Liste stresses the importance of managing limited resources and making strategic trade-offs in platform functionality.
Key takeaway
For AI Architects and MLOps Engineers designing core infrastructure: agentic AI speeds up development and raises operational risk at the same time. Prioritize observability and monitoring platforms that can process data at speeds comparable to AI-generated changes, in effect an "arms race" in which AI counters AI-induced threats. That means scaling existing telemetry systems to handle the much faster data consumption of agentic systems, so that the "three S's" (stability, security, scalability) hold in a rapidly evolving environment.
Key insights
Platform engineering prioritizes stability, security, and scalability, using customer journeys to manage risk and adapt to AI's accelerating impact.
Principles
- Platform services must always be stable, secure, and scalable.
- Scaling systems is difficult due to unknown resource contention.
- Customer journeys effectively measure system reliability and functionality.
Method
Map customer journeys to identify system risks, anticipate scale, and learn from failures so that recurring issues are engineered out, balancing risk appetite against business needs.
In practice
- Measure system reliability using customer journeys such as "Can I pay with my credit card?"
- Anticipate scale for new systems, especially for high-volume events like Black Friday.
- Apply agentic AI to observability so that monitoring keeps pace with AI-driven development.
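As a rough illustration of measuring reliability by customer journey, the sketch below computes a per-journey success rate and flags journeys that fall below a chosen target. It is not taken from the podcast: the `JourneyAttempt` record, the journey names, and the target value are all hypothetical, and a real platform would derive these records from its own telemetry.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JourneyAttempt:
    """One end-to-end attempt at a customer journey (hypothetical record shape)."""
    journey: str      # illustrative name, e.g. "pay_with_credit_card"
    succeeded: bool   # True if the customer completed the journey


def journey_success_rates(attempts):
    """Return {journey name: fraction of attempts that succeeded}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for attempt in attempts:
        totals[attempt.journey] += 1
        if attempt.succeeded:
            successes[attempt.journey] += 1
    return {name: successes[name] / totals[name] for name in totals}


def journeys_below_target(rates, target):
    """Return the names of journeys whose success rate is below `target`, sorted."""
    return sorted(name for name, rate in rates.items() if rate < target)
```

Framing the metric around the journey rather than around individual services keeps the question the same one the customer asks: a journey can fail even when every component reports itself healthy.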
Topics
- Platform Engineering
- System Scalability
- Agentic AI
- Software Apprenticeship
- Risk Management
Best for: AI Architect, MLOps Engineer, DevOps Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by InfoQ.