Mission-Critical Generative AI in Action • Scott Shaw • YOW! 2025
Summary
Commonwealth Bank (CommBank), ranked fourth globally in the Evident AI Index, details its approach to deploying mission-critical generative AI applications. The bank has been putting GenAI into production for 18 months. Its use cases include personalized customer experiences, faster service (e.g., the Ceba chat app), fraud protection, and AI-powered engineering tools for staff; internal use alone runs 80 billion tokens a week through the bank's gateway. The presentation highlights common challenges that keep GenAI prototypes from reaching production: unpredictable performance, high inference costs, guardrail and throttling issues (HTTP 429 errors), governance hurdles, and rapid model deprecation (models often reach end-of-life within a year). To address these, CommBank proposes a "minimal generative AI platform" built on three pillars: a Gateway for consistent access, control, and data collection; Guardrails tuned for regulated environments; and an Evaluation platform for continuous data processing, labeling, and performance assessment.
Key takeaway
If you are an MLOps engineer or AI architect deploying generative AI, recognize that traditional software engineering practices are insufficient. You must proactively design for finite capacity, non-determinism, and rapid model deprecation by implementing a robust GenAI platform. Focus on Evaluation Driven Development and a "smallest, cheapest model first" strategy to keep applications cost-effective and maintainable. Your role is critical in integrating human supervision and mathematical literacy to scale GenAI applications safely.
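One way to read the "smallest, cheapest model first" strategy together with Evaluation Driven Development is sketched below: try candidates in order of cost and keep the first one that clears an evaluation threshold. This is an illustrative sketch, not the bank's implementation. The `Candidate` type, the prices, the `pass_rate` metric, and the 0.9 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str                        # hypothetical model identifier
    cost_per_1k_tokens: float        # illustrative price, not a real quote
    generate: Callable[[str], str]   # stand-in for a model call


# An evaluation case pairs a prompt with a check on the model's output.
EvalCase = Tuple[str, Callable[[str], bool]]


def pass_rate(candidate: Candidate, eval_set: Sequence[EvalCase]) -> float:
    """Fraction of evaluation cases whose output passes its check."""
    passed = sum(1 for prompt, check in eval_set if check(candidate.generate(prompt)))
    return passed / len(eval_set)


def pick_cheapest_passing(
    candidates: Sequence[Candidate],
    eval_set: Sequence[EvalCase],
    threshold: float = 0.9,
) -> Optional[Candidate]:
    """Return the cheapest candidate that meets the threshold, or None."""
    for candidate in sorted(candidates, key=lambda c: c.cost_per_1k_tokens):
        if pass_rate(candidate, eval_set) >= threshold:
            return candidate
    return None
```

Because the evaluation set is the only gate, rerunning the same selection is also a natural way to choose a replacement when a model is deprecated.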
Key insights
Production-ready GenAI demands a structured platform and adapted engineering practices to manage inherent unpredictability and rapid model churn.
Principles
- Prioritize smallest, cheapest models first.
- Embrace Evaluation Driven Development.
- Anticipate finite capacity and code for retries.
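The "code for retries" principle can be sketched as exponential backoff with jitter around a throttled call. This is a minimal example under stated assumptions: `ThrottledError` is a hypothetical exception standing in for whatever a client raises on an HTTP 429 response, and the attempt count and delays are illustrative defaults rather than values from the talk.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error raised when the gateway answers with HTTP 429."""


def call_with_retries(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Run call() and retry on throttling with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the ceiling on each attempt, up to max_delay, then wait
            # a random time below it so clients do not retry in lockstep.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter exists only so the waiting can be replaced in tests. A production version would usually also honor a `Retry-After` header when the server sends one.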
Method
Implement a GenAI platform with a Gateway for model access/control, Guardrails for safety/compliance, and an Evaluation system for continuous performance assessment and data feedback.
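The three parts of the method can be pictured as one request path: check the guardrail, call the model, and record the exchange for later evaluation. The sketch below is a toy illustration only. `MiniPlatform`, the regex-based guardrail, and the record fields are assumptions made for the example; the talk does not describe the bank's platform at this level of detail.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MiniPlatform:
    model: Callable[[str], str]      # stand-in for a call made through the gateway
    blocked: List[re.Pattern]        # guardrail patterns (illustrative)
    log: List[dict] = field(default_factory=list)  # records kept for evaluation

    def complete(self, prompt: str) -> str:
        # Guardrail: refuse before any tokens are spent on the model.
        if any(pattern.search(prompt) for pattern in self.blocked):
            # Do not store the offending prompt, only the fact of the block.
            self.log.append({"prompt": None, "output": None, "blocked": True})
            return "Request blocked by guardrail."
        output = self.model(prompt)
        # Evaluation: keep the exchange so it can be labeled and scored later.
        self.log.append({"prompt": prompt, "output": output, "blocked": False})
        return output
```

A real guardrail would typically be a classifier or policy service rather than a regular expression; the point here is only the ordering of the three steps.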
In practice
- Use a gateway for load balancing and model swapping.
- Tune guardrails for specific application needs.
- Collect production data for continuous model improvement.
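As a rough illustration of the first practice above, a gateway can expose a stable alias to applications and decide behind it which backend serves each request. The sketch uses simple round-robin rotation for load balancing and re-registration for model swapping. The `Gateway` class and its method names are hypothetical and do not correspond to any particular product.

```python
import itertools
from typing import Callable, Dict, Iterator, Sequence

Backend = Callable[[str], str]  # stand-in for a deployed model endpoint


class Gateway:
    """Maps a stable alias to one or more backends and rotates across them."""

    def __init__(self) -> None:
        self._routes: Dict[str, Iterator[Backend]] = {}

    def register(self, alias: str, backends: Sequence[Backend]) -> None:
        # Registering an existing alias again swaps the model for all callers.
        if not backends:
            raise ValueError("an alias needs at least one backend")
        self._routes[alias] = itertools.cycle(list(backends))

    def complete(self, alias: str, prompt: str) -> str:
        backend = next(self._routes[alias])  # round-robin load balancing
        return backend(prompt)
```

Applications only ever name the alias, so replacing a deprecated model becomes a change to gateway configuration rather than to application code.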
Topics
- Generative AI Production
- MLOps
- AI Governance
- Model Evaluation
- LLM Gateways
- Responsible AI
Best for: AI Architect, CTO, VP of Engineering/Data, MLOps Engineer, AI Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by GOTO Conferences.