Your Data, Your Lake: How Observe Uses Iceberg and Streaming ETL for Observability

Source: Data Engineering Podcast · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Cloud Computing & IT Infrastructure) · Depth: Advanced, extended

Summary

Observe cofounder and CTO Jacob Leverich discusses applying lakehouse architectures to observability workloads, with cloud-native warehousing and open table formats such as Iceberg providing scalability and cost efficiency. He explains how this approach, combined with streaming ingest via OpenTelemetry, Kafka-backed durability, curated and columnarized tables, and query orchestration, addresses common pain points: fragmented tools, high costs, and data silos. The resulting system supports low-latency, interactive troubleshooting across logs, metrics, and traces at petabyte scale. Leverich also covers the practicalities of organizing telemetry by use case to minimize read amplification (scanning more data than a query actually needs) and the significance of Iceberg v3's JSON shredding capabilities for a "your data in your lake" strategy.
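To make the idea of JSON shredding concrete, here is a minimal Python sketch. It is an illustration of the general technique only, not Iceberg's on-disk format or Observe's implementation: the function name shred, the _residual column, and the sample events are all hypothetical, and only top-level keys are handled. Frequently queried fields are pulled out into their own columns, so a query on those fields never has to parse the remaining JSON.

```python
import json
from typing import Any


def shred(rows: list[str], hot_paths: list[str]) -> dict[str, list[Any]]:
    """Split JSON documents into one column per hot field plus a residual column.

    Hypothetical helper: only top-level keys are shredded.
    """
    columns: dict[str, list[Any]] = {path: [] for path in hot_paths}
    columns["_residual"] = []
    for raw in rows:
        doc = json.loads(raw)
        for path in hot_paths:
            # pop() moves the value out of the document; missing fields become None
            columns[path].append(doc.pop(path, None))
        # Whatever was not shredded stays behind as compact JSON
        columns["_residual"].append(json.dumps(doc, sort_keys=True))
    return columns


# Made-up sample events for illustration
events = [
    '{"service": "checkout", "status": 500, "msg": "timeout"}',
    '{"service": "cart", "status": 200, "msg": "ok", "region": "us-east-1"}',
]
table = shred(events, hot_paths=["service", "status"])

# This filter reads two shredded columns and never touches the residual JSON
error_services = [
    svc
    for svc, status in zip(table["service"], table["status"])
    if status is not None and status >= 500
]
```

In a columnar file format, the same separation lets the reader skip the residual column entirely for queries like the one above, which is one way read amplification is reduced.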

Key takeaway

CTOs and AI architects evaluating observability solutions should consider a lakehouse architecture like Observe's. It centralizes diverse telemetry data, reduces costs, and speeds up troubleshooting by building on open table formats and streaming ETL. Teams get unified access to petabytes of data, which can lower MTTR and enable AI-driven analytics without the usual constraints of fragmented, expensive legacy systems.

Key insights

Lakehouse architectures can provide scalable, cost-efficient observability by centralizing diverse telemetry data.

Method

Ingest OpenTelemetry data through Kafka, which provides durability and batching, then stream-process it into curated, columnarized Iceberg tables. A query orchestration layer abstracts over SQL, turning each request into an optimized sequence of queries to keep performance interactive.
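The batching and columnarizing steps of this method can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Observe's pipeline: a plain list stands in for a Kafka topic, nested dictionaries stand in for Iceberg tables, and the names micro_batches, append_columnar, and the record fields (signal, ts, service, body) are hypothetical.

```python
from collections import defaultdict
from itertools import islice
from typing import Iterable, Iterator

Record = dict  # one decoded telemetry record (hypothetical shape)


def micro_batches(stream: Iterable[Record], size: int) -> Iterator[list[Record]]:
    """Group a record stream into batches of at most `size` records."""
    it = iter(stream)
    while batch := list(islice(it, size)):
        yield batch


def append_columnar(tables: dict, batch: list[Record]) -> None:
    """Route each record to the table for its signal type, stored column-wise."""
    for rec in batch:
        table = tables[rec["signal"]]
        for col in ("ts", "service", "body"):
            table[col].append(rec.get(col))


# tables[signal][column] -> list of values; stand-in for one table per use case
tables = defaultdict(lambda: defaultdict(list))

# A plain list stands in for records consumed from a durable log such as Kafka
stream = [
    {"signal": "logs", "ts": 1, "service": "checkout", "body": "timeout"},
    {"signal": "metrics", "ts": 1, "service": "checkout", "body": 0.93},
    {"signal": "logs", "ts": 2, "service": "cart", "body": "ok"},
    {"signal": "traces", "ts": 2, "service": "cart", "body": "span-a"},
    {"signal": "logs", "ts": 3, "service": "checkout", "body": "retry"},
]

batch_sizes = []
for batch in micro_batches(stream, size=2):
    batch_sizes.append(len(batch))
    append_columnar(tables, batch)

# A log query scans only the logs table: 3 rows instead of all 5 ingested
rows_scanned_for_log_query = len(tables["logs"]["ts"])
```

Routing records into separate tables by signal type is one simple reading of "organizing telemetry by use case": a query for logs never scans metric or trace rows, which keeps read amplification down.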

Best for: CTO, VP of Engineering/Data, AI Architect, Data Engineer, MLOps Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Data Engineering Podcast.