QCon AI Boston 2026 Schedule: Agents in Production, Inference Cost, and AI in the SDLC
Summary
The full schedule for QCon AI Boston 2026, running June 1-2 at Boston University, has been released. The two-day program focuses on the engineering problems that appear after the first AI demo: moving agents into production, managing inference costs, keeping non-deterministic systems auditable, and rethinking software development processes once AI is part of them. The program is organized around four themes.
"Context engineering for agents" features sessions from LinkedIn and Redis on adapting agents to internal services and building production-grade AI beyond prompting.
"Inference economics and infrastructure" covers KV cache optimization for LLM serving at scale, scaling AI agent infrastructure with Ray, and a market analysis of AI infrastructure bottlenecks.
"Reliability, evaluation, and safety" includes talks on AI-powered safety systems at DoorDash, adaptive recommenders at Netflix, reusable evaluation frameworks for agentic AI products, and zero-trust agent systems at Broadcom.
"AI inside the developer workflow" explores AI's impact on the SDLC, including a case study from Red Hat and a keynote on AI maturity in engineering organizations.
Key takeaway
For machine learning engineers tasked with deploying AI solutions, the QCon AI Boston 2026 schedule highlights the areas that matter most for production readiness: context engineering for agents, inference economics, and evaluation and safety frameworks built into the development lifecycle. Moving beyond prototypes to scalable, auditable, and cost-effective AI systems is what the program treats as the mark of organizational AI maturity.
Key insights
QCon AI Boston 2026 addresses the practical engineering challenges of deploying AI agents and models into production.
Principles
- Production AI requires context engineering.
- Inference cost dictates AI architecture.
- AI safety is an engineering task.
Method
LinkedIn uses the Model Context Protocol (MCP) to adapt coding agents to internal services, while Redis focuses on the data and retrieval context that makes LLM outputs reliable, rather than relying on prompt iteration alone.
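To make the MCP idea concrete, here is a minimal sketch of the kind of message an agent sends when it asks an MCP server to run a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method. Everything else here is an assumption for illustration: the helper `build_tool_call`, the tool name `lookup_service_owner`, and its arguments are hypothetical and are not taken from LinkedIn's session.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request that asks an MCP server to run a tool.

    tool_name and arguments are whatever the server advertises; the ones
    used below are hypothetical.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical internal tool that an agent might call to find a service owner.
message = build_tool_call(1, "lookup_service_owner", {"service": "payments"})
wire_format = json.dumps(message)
```

The point of the sketch is that the agent does not need service-specific glue code: an internal service is exposed as a named tool with structured arguments, and the agent reaches it through one uniform request shape.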
In practice
- Optimize KV cache for LLM inference cost.
- Scale AI agent infrastructure using Ray.
- Implement reusable evaluation frameworks for agents.
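As a rough illustration of why the KV cache matters for inference cost, the sketch below estimates its memory footprint with the standard sizing formula: one key tensor and one value tensor per layer, each of shape batch x KV heads x sequence length x head dimension. The model dimensions are hypothetical and chosen only to make the arithmetic easy to follow; they do not describe any system from the schedule.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_elem=2 assumes 16-bit values (fp16 or bf16).
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


GIB = 1024 ** 3

# Hypothetical 32-layer model, head_dim 128, one 4096-token sequence.
full_heads = kv_cache_bytes(num_layers=32, num_kv_heads=32,
                            head_dim=128, seq_len=4096)
# Same model with 8 shared KV heads instead of 32 (grouped-query attention).
grouped_heads = kv_cache_bytes(num_layers=32, num_kv_heads=8,
                               head_dim=128, seq_len=4096)

print(full_heads / GIB)     # 2.0
print(grouped_heads / GIB)  # 0.5
```

Under these assumptions a single 4096-token request holds 2 GiB of cache, and the figure grows linearly with batch size and context length. That is why cache size, rather than raw compute, often bounds how many requests one accelerator can serve at once.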
Topics
- AI Agents in Production
- Context Engineering
- Inference Economics
- LLM Infrastructure
- AI System Reliability
Best for: Machine Learning Engineer, AI Engineer, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by InfoQ.