How I Engineered a 1.7M Row Product Safety Database to Serve Pages in 66ms on a Single Free VM
Summary
The article describes how LemonKnows, a consumer safety platform tracking 1.7M products, 41,000 global government recalls, and 100,000+ community complaints, achieved 66ms page assembly times on a free Oracle Cloud VM. This was accomplished by addressing common database performance bottlenecks. Key strategies included replacing real-time COUNT(*) queries with an eventually consistent stats_cache lookup table for pagination, optimizing indexing by using a single composite index tailored to specific query patterns, and implementing a zero-cost multi-language AI translation layer. The translation layer uses a content-addressed cache, translating unique English source texts once via Gemini 2.5 Flash Lite and serving subsequent requests from local storage. Additionally, a database-level unique constraint was used for user-generated content image abuse prevention.
Key takeaway
For software engineers optimizing database performance on constrained resources, prioritize eventual consistency for non-critical aggregates to avoid COUNT(*) bottlenecks. Design composite indexes precisely matching your query patterns to prevent partial sequential scans. Implement content-addressed caching for LLM translations to eliminate recurring API costs and latency. This approach significantly reduces infrastructure demands and improves user experience.
Key insights
Optimizing large-scale database performance and cost on minimal infrastructure requires strategic data management and query pattern alignment.
Principles
- Replace real-time COUNT(*) with pre-aggregated caches.
- Composite indexes must mirror WHERE and ORDER BY clauses.
- Cache LLM translations content-addressed for zero recurring cost.
Method
Implement stats_cache for counts, use composite indexes mirroring query patterns, and cache LLM translations via SHA-256 hashes. Enforce UGC limits with database-level unique constraints.
In practice
- Replace COUNT(*) with stats_cache for pagination.
- Design composite indexes matching WHERE and ORDER BY sequences.
Topics
- PostgreSQL Performance
- Composite Indexing
- LLM Translation Caching
- Database Optimization
- Oracle Cloud Free Tier
- Eventual Consistency
Best for: Software Engineer, Machine Learning Engineer, AI Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by HackerNoon.