Spotlighting The World Factbook as We Bid a Fond Farewell

· Source: Simon Willison's Weblog · Field: Technology & Digital — Data Science & Analytics, Software Development & Engineering, Emerging Technologies & Innovation · Depth: Fundamental Awareness, quick

Summary

The Central Intelligence Agency (CIA) has announced that it is discontinuing "The World Factbook," one of its longest-running public intelligence publications, published since 1971 and online since 1997. The CIA not only stopped maintaining the publication but also removed the entire website, including its historical archives, and redirected every former page to a closure announcement. The article calls this "cultural vandalism," given the Factbook's public domain status and historical value. Fortunately, the Internet Archive preserves annual zip archives of the site up to 2020. One individual has extracted the 384MB 2020 archive into a GitHub repository and made it browsable via GitHub Pages, preserving access to its data on 267 world entities.

Key takeaway

For data scientists and researchers who rely on public domain geopolitical data, the discontinuation of "The World Factbook" means shifting to archived versions, the most recent of which dates from 2020. Verify where your data comes from, and consider contributing to community-led archiving efforts so that this kind of public domain information stays available in the long term.

Key insights

The CIA has sunset "The World Factbook," prompting community efforts to preserve its public domain archives.

Method

Download annual zip archives of "The World Factbook" from the Internet Archive, extract them, and host them on platforms like GitHub Pages for continued public access.
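
The extraction step of this method can be sketched in Python. This is a minimal illustration under stated assumptions, not the workflow the original author used: it assumes the zip file has already been downloaded from the Internet Archive by hand, and the function name `extract_archive_for_pages` and the paths in the usage note are hypothetical. It unpacks the archive into a folder and adds an empty `.nojekyll` file, which asks GitHub Pages to serve the files as-is rather than running them through Jekyll.

```python
import zipfile
from pathlib import Path


def extract_archive_for_pages(zip_path, site_dir):
    """Extract an already-downloaded zip archive into site_dir for static hosting.

    Hypothetical helper for illustration; returns the list of extracted file paths.
    """
    root = Path(site_dir)
    root.mkdir(parents=True, exist_ok=True)
    root = root.resolve()

    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            # Skip any entry whose name would resolve outside the output folder.
            target = (root / member.filename).resolve()
            if not target.is_relative_to(root):  # requires Python 3.9+
                continue
            # extract() returns the path it actually wrote to.
            written = Path(archive.extract(member, root))
            if not member.is_dir():
                extracted.append(written)

    # An empty .nojekyll file tells GitHub Pages to skip Jekyll processing.
    (root / ".nojekyll").touch()
    return extracted
```

As a usage example, calling `extract_archive_for_pages("factbook-2020.zip", "docs")` (both names are placeholders) would fill a `docs` folder that could then be committed to a repository and selected as the GitHub Pages publishing source.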

Best for: Software Engineer, Data Scientist, Tech Journalist

Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.