Podcast: Engineering Stable, Secure and Scalable Platforms: A Conversation with Matthew Liste

2026-04-20 · Source: InfoQ · Field: Technology & Digital — Software Development & Engineering, Cloud Computing & IT Infrastructure, Artificial Intelligence & Machine Learning · Depth: Advanced, extended

Summary

Matthew Liste, an infrastructure engineering veteran from American Express and J.P. Morgan Chase, discusses the critical aspects of building and managing stable, secure, and scalable software platforms in a podcast with Michael Stiefel. The conversation, recorded on April 20, 2026, emphasizes that platform services are the foundation for application development and must consistently uphold these "three S's." Liste highlights how hard it is to scale systems, because resource contention often goes unforeseen until load arrives, and he advocates using customer journeys to identify system vulnerabilities. The discussion also explores the dual impact of artificial intelligence: it can accelerate development but also increases risk, and by automating basic coding tasks it may hinder the apprenticeship of junior engineers. Liste stresses the importance of managing limited resources and making strategic trade-offs in platform functionality.

Key takeaway

For AI Architects and MLOps Engineers designing core infrastructure: recognize that agentic AI accelerates development and operational risk alike. Prioritize observability and monitoring platforms that can keep pace with the rate of AI-generated changes, which effectively creates an "arms race" in which AI combats AI-induced threats. In practice this means scaling existing telemetry systems to handle far faster data consumption by agentic systems, so that the "three S's" (stability, security, scalability) are maintained in a rapidly evolving environment.

Key insights

Platform engineering prioritizes stability, security, and scalability, using customer journeys to manage risk and adapt to AI's accelerating impact.

Principles

Platform services must always be stable, secure, and scalable.
Scaling systems is difficult due to unknown resource contention.
Customer journeys effectively measure system reliability and functionality.

Method

Identify system risks by mapping customer journeys, then anticipate scale and learn from failures to engineer out recurring issues, balancing risk appetite with business needs.

In practice

Measure system reliability using customer journeys like "Can I pay with my credit card?".
Anticipate scale for new systems, especially for high-volume events like Black Friday.
Apply agentic AI to observability so that monitoring keeps pace with AI-driven development speed.
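
As a rough illustration of the first item, the sketch below measures reliability per customer journey rather than per service: a journey attempt counts as a success only if every step in it succeeded. This is a minimal sketch, not something described in the podcast. The record shape (JourneyAttempt), the journey name "pay_with_card", and the 99.9% target are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class JourneyAttempt:
    """One end-to-end attempt at a customer journey (hypothetical record shape)."""
    journey: str
    steps_ok: Tuple[bool, ...]  # outcome of each step, in order

    @property
    def succeeded(self) -> bool:
        # The journey succeeds only if every step in it succeeded.
        return all(self.steps_ok)


def journey_success_rate(attempts: Iterable[JourneyAttempt], journey: str) -> Optional[float]:
    """Fraction of attempts at `journey` that completed end to end.

    Returns None when there are no attempts, so "no data" is not
    mistaken for either 0% or 100% reliability.
    """
    relevant = [a for a in attempts if a.journey == journey]
    if not relevant:
        return None
    return sum(a.succeeded for a in relevant) / len(relevant)


def meets_target(attempts: Iterable[JourneyAttempt], journey: str, target: float) -> bool:
    """True when the measured success rate reaches the (assumed) target."""
    rate = journey_success_rate(attempts, journey)
    return rate is not None and rate >= target


# Hypothetical data: four attempts to pay by card, one failing at step two.
attempts = [
    JourneyAttempt("pay_with_card", (True, True, True)),
    JourneyAttempt("pay_with_card", (True, False, True)),
    JourneyAttempt("pay_with_card", (True, True, True)),
    JourneyAttempt("pay_with_card", (True, True, True)),
]

rate = journey_success_rate(attempts, "pay_with_card")
ok = meets_target(attempts, "pay_with_card", target=0.999)
```

The point of grouping by journey is that a single failing step (for example, an authorization call) fails the whole "Can I pay with my credit card?" question, even when every other component reports healthy.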

Topics

Platform Engineering
System Scalability
Agentic AI
Software Apprenticeship
Risk Management

Best for: AI Architect, MLOps Engineer, DevOps Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by InfoQ.