The Architecture Behind Atlas: OpenAI’s New ChatGPT-based Browser
Summary
OpenAI launched ChatGPT Atlas, a web browser designed for AI co-piloting. Its goals of instant startup, responsiveness with hundreds of tabs open, and rich animations required a novel architectural approach. Instead of embedding Chromium directly, OpenAI developed OWL (OpenAI's Web Layer), an architecture that runs Chromium as a separate process. This separation lets the Atlas UI, built with SwiftUI and Metal, communicate with the Chromium host via Mojo, using custom Swift and TypeScript bindings. OWL handles cross-process rendering with the macOS CALayerHost API, which shares GPU memory efficiently, and handles input by translating NSEvents into WebInputEvents. The design also supports Agent mode, which composites pop-up UI elements into single screenshots for AI input and uses isolated, in-memory storage partitions for ephemeral sessions to protect security and data privacy.
Key takeaway
For AI Architects and Software Engineers building complex applications with embedded web technologies, consider adopting a decoupled architecture like OpenAI's OWL. This approach allows for faster UI startup, enhanced stability through process isolation, and greater flexibility for integrating advanced features like AI agents, while minimizing the overhead of maintaining custom patches against upstream web engines. Your team can achieve higher developer productivity by abstracting the web engine into a prebuilt binary.
Key insights
Decoupling the browser UI from the web engine enables advanced AI-powered browsing features and developer agility.
Principles
- Prioritize developer velocity and rapid iteration.
- Isolate components for stability and security.
- Abstract complex systems with clean APIs.
Method
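Run Chromium as a separate process (OWL Host) from the main application (OWL Client), with the two communicating via Mojo IPC. Display the host's rendered content in the client through CALayerHost, and translate the client's native input events into the host's web input events so that interaction works across the process boundary.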
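The client-to-host input path described above can be illustrated with a minimal sketch. Every name here (`NativeEvent`, `WebEvent`, `HostChannel`, `InMemoryHost`, `Client`) is hypothetical and does not come from Atlas, OWL, or Chromium. The in-memory channel with JSON serialization merely stands in for a real IPC transport; it is not Mojo and does not use Mojo's generated bindings. The one platform detail assumed is that AppKit views use a bottom-left origin by default, while web content uses a top-left origin.

```typescript
// Hypothetical event shapes, loosely modeled on a native UI event and a web input event.
type NativeEvent =
  | { kind: "keyDown"; keyCode: number; characters: string; timestampSec: number }
  | { kind: "mouseDown"; x: number; y: number; viewHeight: number; timestampSec: number };

type WebEvent =
  | { type: "rawKeyDown"; nativeKeyCode: number; text: string; timeMs: number }
  | { type: "mouseDown"; x: number; y: number; timeMs: number };

// Translate a client-side event into the form the host process expects.
function translate(e: NativeEvent): WebEvent {
  const timeMs = Math.round(e.timestampSec * 1000);
  switch (e.kind) {
    case "keyDown":
      return { type: "rawKeyDown", nativeKeyCode: e.keyCode, text: e.characters, timeMs };
    case "mouseDown":
      // Flip the y axis: bottom-left origin on the client, top-left in web content.
      return { type: "mouseDown", x: e.x, y: e.viewHeight - e.y, timeMs };
  }
}

// Stand-in for the IPC boundary: only serialized messages cross it.
interface HostChannel {
  send(payload: string): void;
}

// Simulated host side that records the events it receives.
class InMemoryHost implements HostChannel {
  readonly received: WebEvent[] = [];

  send(payload: string): void {
    this.received.push(JSON.parse(payload) as WebEvent);
  }
}

// Client side: translates each native event, then forwards it over the channel.
class Client {
  private readonly channel: HostChannel;

  constructor(channel: HostChannel) {
    this.channel = channel;
  }

  forward(e: NativeEvent): void {
    this.channel.send(JSON.stringify(translate(e)));
  }
}
```

Keeping the translation in a pure function, separate from the transport, means the mapping can be tested without starting a second process.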
In practice
- Use process isolation for complex application components.
- Employ efficient graphics primitives for cross-process rendering.
- Implement ephemeral sessions for AI agent privacy.
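The ephemeral-session idea in the last item can be sketched as an in-memory storage partition per agent session. This is an illustration under stated assumptions, not how Atlas or Chromium implements storage partitions: `EphemeralPartition` and `PartitionRegistry` are hypothetical names, and a plain key-value map stands in for cookies and other site data.

```typescript
// Hypothetical per-session store: data lives only in memory, never on disk.
class EphemeralPartition {
  private entries = new Map<string, string>();

  set(name: string, value: string): void {
    this.entries.set(name, value);
  }

  get(name: string): string | undefined {
    return this.entries.get(name);
  }
}

// Hypothetical registry that hands out one isolated partition per agent session.
class PartitionRegistry {
  private partitions = new Map<string, EphemeralPartition>();
  private nextId = 0;

  // Start a session with a fresh, empty partition and return its id.
  open(): string {
    const id = `agent-${this.nextId++}`;
    this.partitions.set(id, new EphemeralPartition());
    return id;
  }

  // Look up a live session's partition; closed or unknown ids are rejected.
  partition(id: string): EphemeralPartition {
    const found = this.partitions.get(id);
    if (!found) {
      throw new Error(`unknown or closed session: ${id}`);
    }
    return found;
  }

  // End a session: dropping the only reference discards all of its data.
  close(id: string): void {
    this.partitions.delete(id);
  }
}
```

Because each session gets its own map and nothing is written to disk, one session cannot read another's data, and closing a session leaves nothing behind to clean up.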
Topics
- ChatGPT Atlas
- OWL Architecture
- Chromium
- Inter-Process Communication
- Agentic Browsing
Best for: Software Engineer, AI Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by ByteByteGo Newsletter.