Gate-level emulation of an Intel 4004 in 4004 bytes of C
Summary
Nicholas Carlini's winning entry in the International Obfuscated C Code Contest is a feature-complete, gate-level emulator for the Intel 4004 microprocessor that runs the original Busicom 141-PF calculator ROM. Written in under 4004 bytes of C, it simulates a circuit representation of the 4004 that was itself designed in the miniHDL Python DSL. The program decompresses a highly compressed circuit description, emulates its logic gates, and simulates the calculator's I/O hardware, including keypad input and print-drum output. Despite emulating a 200,000-gate CPU, it reaches a usable speed of a few hundred instructions per second through optimizations such as recalculating only the gates whose inputs have changed. The 4004 itself is a 4-bit, accumulator-based CPU with 45 instructions and 16 general-purpose registers, originally designed for the Busicom calculator.
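The selective recalculation mentioned above can be illustrated with a minimal sketch. It is not the entry's actual code: the six-node netlist (four NAND gates wired as an XOR), the array names, and the functions `set_input`, `settle`, and `output` are all assumptions made for illustration. Only gates marked dirty are re-evaluated, and a gate marks its readers dirty only when its own value changes.

```c
#include <assert.h>

enum { N_NODES = 6, N_INPUTS = 2 };

/* Hypothetical netlist: nodes 0 and 1 are primary inputs, nodes 2..5 are
   NAND gates forming XOR(node 0, node 1); node 5 is the output.
   Entries for the two input nodes are unused placeholders. */
static const int in_a[N_NODES] = {0, 0, 0, 0, 2, 3};
static const int in_b[N_NODES] = {0, 0, 1, 2, 1, 4};

/* Consistent starting state for inputs (0, 0). */
static unsigned char value[N_NODES] = {0, 0, 1, 1, 1, 0};
static unsigned char dirty[N_NODES];

/* Mark every gate that reads `node` for re-evaluation. A linear scan is
   acceptable here because the netlist is tiny. */
static void mark_readers(int node) {
    for (int g = N_INPUTS; g < N_NODES; g++)
        if (in_a[g] == node || in_b[g] == node) dirty[g] = 1;
}

/* Set a primary input; dependents are marked only if the value changed. */
void set_input(int node, int v) {
    assert(node >= 0 && node < N_INPUTS);
    v = (v != 0);
    if (value[node] != v) {
        value[node] = (unsigned char)v;
        mark_readers(node);
    }
}

/* Re-evaluate dirty gates until nothing changes. Returns the number of
   gate evaluations performed. An oscillating circuit would never settle;
   this sketch assumes the circuit is stable. */
int settle(void) {
    int evals = 0, again = 1;
    while (again) {
        again = 0;
        for (int g = N_INPUTS; g < N_NODES; g++) {
            if (!dirty[g]) continue;
            dirty[g] = 0;
            evals++;
            unsigned char v = !(value[in_a[g]] & value[in_b[g]]);
            if (v != value[g]) {
                value[g] = v;
                mark_readers(g);
                again = 1;
            }
        }
    }
    return evals;
}

int output(void) { return value[N_NODES - 1]; }
```

Calling `set_input(0, 1)` followed by `settle()` touches three of the four gates, and a second `settle()` with no input change evaluates none. On a circuit with many gates, most of which are idle on any given step, that difference is where the speedup comes from.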
Key takeaway
For AI scientists or embedded systems engineers working under extreme resource constraints, this project shows that highly complex systems can fit in a minimal footprint. Consider combining advanced compression techniques, such as custom LZ-style algorithms and arithmetic coding, with self-modifying code patterns to raise code density for specialized hardware emulation or low-level system development. This approach can enable functionality that would otherwise seem out of reach within tight memory or binary-size limits.
Key insights
A gate-level Intel 4004 emulator fits into 4004 bytes of C using extreme compression and self-modifying code.
Principles
- Compression can enable complex systems within severe size constraints.
- Reusing existing circuit components saves space in compressed designs.
- Asymptotic complexity matters less if N is small enough.
Method
The circuit description is compressed with LZ-style and arithmetic coding. Gate references are delta-encoded, and a custom REPEAT instruction duplicates sub-circuits. A self-modifying C program decompresses this description and emulates the resulting circuit gate by gate.
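To see why delta-encoded references and a REPEAT instruction work well together, consider the following sketch. The instruction format, the opcode names `OP_GATE` and `OP_REPEAT`, and the function `expand` are hypothetical and are not taken from the entry. Each gate names its inputs as distances back from its own index, so a copied block is rewired to its new position simply by shifting the indices by the block length.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_GATE, OP_REPEAT };   /* hypothetical opcodes */
enum { MAX_GATES = 64 };

/* OP_GATE:   x, y = back-distances from this gate to its two inputs.
   OP_REPEAT: x = block length in gates, y = number of extra copies. */
typedef struct { int op, x, y; } Instr;

/* Expanded gate with absolute input indices. */
typedef struct { int a, b; } Gate;

/* Expand `prog` into gates[]. Nodes 0..n_inputs-1 are primary inputs, so
   the first gate gets index n_inputs. Returns the total node count, or -1
   on malformed input. */
int expand(const Instr *prog, size_t n_instr, int n_inputs, Gate *gates) {
    assert(prog != NULL && gates != NULL);
    int n = n_inputs;
    for (size_t i = 0; i < n_instr; i++) {
        int x = prog[i].x, y = prog[i].y;
        if (prog[i].op == OP_GATE) {
            if (n >= MAX_GATES || x < 1 || x > n || y < 1 || y > n) return -1;
            gates[n].a = n - x;   /* delta to absolute index */
            gates[n].b = n - y;
            n++;
        } else if (prog[i].op == OP_REPEAT) {
            if (x < 1 || x > n - n_inputs || y < 0 || y > MAX_GATES) return -1;
            for (int k = 0; k < x * y; k++) {
                if (n >= MAX_GATES) return -1;
                /* Same relative wiring as the gate one block earlier. */
                gates[n].a = gates[n - x].a + x;
                gates[n].b = gates[n - x].b + x;
                n++;
            }
        } else {
            return -1;
        }
    }
    return n;
}
```

With two inputs, the program `{OP_GATE,1,2}, {OP_GATE,1,1}, {OP_REPEAT,2,2}` expands to three chained two-gate stages: the first stage reads the primary inputs, and each copy reads the outputs of the stage before it. Three instructions describe six gates, which is the kind of saving that matters when repeated structures dominate a circuit.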
In practice
- Explore custom LZ-style compression for specific data types.
- Consider self-modifying code for extreme size optimization.
- Use delta-encoding for relative addressing in circuit descriptions.
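As a starting point for the first suggestion, here is a minimal sketch of an LZ-style decoder. The `Token` layout and the function `lz_decode` are assumptions for illustration, not the format used in the entry. A token is either a literal byte or a (distance, length) reference into the output already produced, and the byte-at-a-time copy lets a reference overlap its own output so that short patterns can repeat many times.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token: dist == 0 means "emit lit"; otherwise copy `len`
   bytes starting `dist` bytes back in the output. */
typedef struct { int dist, len; unsigned char lit; } Token;

/* Decode tokens into out[0..cap). Returns the number of bytes written,
   or -1 on malformed input or overflow. */
int lz_decode(const Token *toks, size_t n_toks, unsigned char *out, int cap) {
    assert(toks != NULL && out != NULL);
    int n = 0;
    for (size_t i = 0; i < n_toks; i++) {
        if (toks[i].dist == 0) {
            if (n >= cap) return -1;
            out[n++] = toks[i].lit;
        } else {
            if (toks[i].dist < 0 || toks[i].dist > n || toks[i].len < 0)
                return -1;
            /* Copy one byte at a time so len may exceed dist. */
            for (int k = 0; k < toks[i].len; k++) {
                if (n >= cap) return -1;
                out[n] = out[n - toks[i].dist];
                n++;
            }
        }
    }
    return n;
}
```

Two literals `a`, `b` followed by one reference with distance 2 and length 4 decode to `ababab`. Tailoring the token alphabet to the data, for example by letting references cover whole gates instead of bytes, is where a custom scheme can beat a general-purpose compressor.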
Topics
- Intel 4004 Emulation
- Gate-Level Simulation
- Code Obfuscation
- Data Compression
- Hardware Description Languages
Best for: AI Scientist, Software Engineer, Research Scientist, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Nicholas Carlini.