Spotlighting The World Factbook as We Bid a Fond Farewell
Summary
The Central Intelligence Agency (CIA) has announced the discontinuation of "The World Factbook," one of its longest-running public intelligence publications, which had been available since 1971 and online since 1997. The CIA not only stopped maintaining the publication but also removed its entire website, including historical archives, and redirected all previous pages to a closure announcement. The original article describes this as "cultural vandalism," given the Factbook's public domain status and historical value. Fortunately, annual zip file archives of the site, up to 2020, are preserved on the Internet Archive. One individual has extracted the 384MB 2020 archive into a GitHub repository and made it browsable via GitHub Pages, preserving access to its extensive data on 267 world entities.
Key takeaway
For data scientists and researchers who rely on open geopolitical data, the discontinuation of "The World Factbook" means switching to archived versions. Verify where your copy of the data comes from, and consider contributing to community-led archiving efforts to ensure the long-term availability of such critical public domain information.
Key insights
The CIA has sunset "The World Factbook," prompting community efforts to preserve its public domain archives.
Principles
- Public domain content should remain accessible.
- Digital archives are crucial for historical data.
Method
Download annual zip archives of "The World Factbook" from the Internet Archive, extract them, and host them on platforms like GitHub Pages for continued public access.
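As a rough illustration of the steps above, the following Python sketch downloads a zip archive and unpacks it into a folder that GitHub Pages could serve. The function names `download_archive` and `extract_for_pages` are hypothetical, the archive URL is left for the caller to supply, and the `docs` output folder is only an assumption about how the repository is laid out. It is not the method used in the original article.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def download_archive(url: str, dest: Path) -> Path:
    """Stream a zip archive from `url` to `dest` (hypothetical helper; the caller supplies the URL)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


def extract_for_pages(zip_path: Path, site_dir: Path) -> int:
    """Extract `zip_path` into `site_dir` and return the number of files written."""
    site_dir.mkdir(parents=True, exist_ok=True)
    root = site_dir.resolve()
    count = 0
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            target = (root / member.filename).resolve()
            # Refuse entries that would land outside the output directory.
            if not target.is_relative_to(root):
                raise ValueError(f"unsafe path in archive: {member.filename}")
            archive.extract(member, root)
            if not member.is_dir():
                count += 1
    # An empty .nojekyll file asks GitHub Pages to skip Jekyll processing,
    # so files and folders starting with an underscore are still served.
    (root / ".nojekyll").touch()
    return count
```

A typical use would be to call `download_archive(url, Path("factbook-2020.zip"))` with the archive URL taken from the Internet Archive item page, then `extract_for_pages(Path("factbook-2020.zip"), Path("docs"))`, commit the `docs` folder, and enable GitHub Pages for it in the repository settings. The path check requires Python 3.9 or later.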
In practice
- Mirror valuable public domain datasets.
- Use the Internet Archive to retrieve historical data.
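For retrieving individual pages rather than whole zip archives, one option is the Internet Archive's Wayback Machine availability endpoint, which reports the closest archived snapshot of a URL. The sketch below is a minimal example under that assumption; the helper names `availability_url`, `closest_snapshot`, and `lookup` are hypothetical, and the page URL is a placeholder.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url: str, timestamp: str = "") -> str:
    """Build the availability query for `page_url`, optionally near `timestamp` (YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_API}?{urllib.parse.urlencode(params)}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the snapshot URL from an availability response, or None if nothing is archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timestamp: str = "") -> Optional[str]:
    """Query the endpoint over the network and return the closest snapshot URL, if any."""
    with urllib.request.urlopen(availability_url(page_url, timestamp)) as response:
        return closest_snapshot(json.load(response))
```

Calling `lookup("example.com/page", "20200101")` asks for the snapshot nearest to 1 January 2020. Keeping the URL building and response parsing separate from the network call makes both easy to check offline.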
Topics
- The World Factbook
- Data Archiving
- Public Information Access
- CIA Publications
- Geopolitical Data
Best for: Software Engineer, Data Scientist, Tech Journalist
Editorial summary, takeaway, and curation by AIssential. Original article published by Simon Willison's Weblog.