Gwtar: a static efficient single-file HTML format

Source: Simon Willison's Weblog · Field: Technology & Digital — Software Development & Engineering, Emerging Technologies & Innovation · Depth: Intermediate, quick

Summary

Gwtar is a new project by Gwern Branwen and Said Achmiz for packaging a page and its many assets into a single HTML file that browsers can still view efficiently. The trick is in the ordering: a script near the top of the file calls `window.stop()` so the browser does not download the whole document, and the rest of the file is uncompressed tar data stored inline. Assets are then fetched on demand with HTTP range requests against that tar data. The already-loaded JavaScript rewrites asset URLs to `https://localhost/` purely so that they fail to load, and a `PerformanceObserver` catches those attempted loads. A `resourceURLStringsHandler` callback then finds each resource, using already-loaded data if available and a range request otherwise, and inserts it into the page through a `blob:` URL, so large media files load only when the page needs them.
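The range-request step can be sketched as follows. This is a minimal illustration, not Gwtar's actual code: the index entry shape (`name`, `offset`, `size`) and the helper names `rangeHeaderFor` and `fetchEntryBytes` are assumptions made for the example. It shows why uncompressed tar data matters: each asset occupies a contiguous byte span of the file, so a single `Range` header can address it.

```javascript
// Hypothetical index entry: where one asset's bytes sit inside the single file.
// Gwtar's real index format is not described in this summary; this shape is assumed.
const exampleEntry = { name: "images/photo.jpg", offset: 4096, size: 1500 };

// Build an HTTP Range header value. Byte ranges are inclusive at both ends,
// so the last byte is offset + size - 1.
function rangeHeaderFor(entry) {
  if (!Number.isInteger(entry.offset) || !Number.isInteger(entry.size) ||
      entry.offset < 0 || entry.size <= 0) {
    throw new RangeError("entry needs a non-negative offset and a positive size");
  }
  return `bytes=${entry.offset}-${entry.offset + entry.size - 1}`;
}

// Fetch only that slice of the archive file.
// fetchImpl is injectable so the function can be exercised without a network.
async function fetchEntryBytes(archiveUrl, entry, fetchImpl = fetch) {
  const response = await fetchImpl(archiveUrl, {
    headers: { Range: rangeHeaderFor(entry) },
  });
  // 206 Partial Content means the server honoured the range; a plain 200
  // would mean it sent the whole file instead.
  if (response.status !== 206) {
    throw new Error(`expected 206 Partial Content, got ${response.status}`);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

In this sketch the archive URL would simply be the page's own URL, since the tar data lives in the same file as the HTML.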

Key takeaway

For web developers or content creators who need to package complex web content into a single, portable HTML file, Gwtar offers a novel way to load assets efficiently. Consider it if you want to distribute self-contained pages with large media files: assets are lazy-loaded from the embedded archive, so the browser does not have to download the whole file before showing the page.

Key insights

Gwtar enables efficient, lazy-loaded, single-file HTML archives using `window.stop()` and HTTP range requests.

Principles

Method

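The method is to call `window.stop()` early, follow it with inline tar data, rewrite asset URLs to `https://localhost/` so that they fail to load, and use a `PerformanceObserver` to catch those failed loads. Each caught resource is taken from already-loaded data or fetched with a range request, then inserted into the page via a `blob:` URL.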
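The URL-rewriting and interception steps might be wired together roughly as below. This is a sketch under stated assumptions rather than Gwtar's implementation: `toPlaceholderURL`, `fromPlaceholderURL`, `installInterceptor`, and the `loadBytes` callback are hypothetical names, and the sketch assumes the browser reports failed loads as `resource` performance entries, as the description implies.

```javascript
// Placeholder origin from the description; requests to it are expected to fail.
const PLACEHOLDER_ORIGIN = "https://localhost";

// Map an archive-relative asset path to a URL that will deliberately fail to load.
function toPlaceholderURL(assetPath) {
  return new URL(assetPath, PLACEHOLDER_ORIGIN + "/").href;
}

// Recover the asset path from a placeholder URL, or null for any other URL.
function fromPlaceholderURL(urlString) {
  const url = new URL(urlString);
  if (url.origin !== PLACEHOLDER_ORIGIN) return null;
  return decodeURIComponent(url.pathname.slice(1));
}

// Browser-only wiring (defined here, not called). loadBytes is a hypothetical
// async function returning { bytes, mimeType } for a path, either from data
// already downloaded or from an HTTP range request.
function installInterceptor(loadBytes) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      const path = fromPlaceholderURL(entry.name); // entry.name is the requested URL
      if (path === null) continue; // not one of the rewritten assets
      loadBytes(path).then(({ bytes, mimeType }) => {
        const blobUrl = URL.createObjectURL(new Blob([bytes], { type: mimeType }));
        // Point every element that used the placeholder at the blob: URL.
        for (const el of document.querySelectorAll("[src]")) {
          if (el.src === entry.name) el.src = blobUrl;
        }
      });
    }
  });
  observer.observe({ type: "resource", buffered: true });
  return observer;
}
```

Keeping the two URL helpers pure means the mapping can be checked outside a browser, while `installInterceptor` only runs where `document` and `Blob` URLs are available.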

Best for: Software Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.