TimeLens: On-Device Artifact Recognition with Retrieval-Augmented Question Answering for the Grand Egyptian Museum
Summary
TimeLens is an AI-powered bilingual mobile guide developed for the Grand Egyptian Museum (GEM). Visitors point their phone at an exhibit to get real-time artifact recognition and can ask follow-up questions in English or Arabic. The system addresses three challenges: fine-grained visual similarity among 51 catalogued artifacts, the gap between curated training data and handheld camera conditions, and the need to prevent unsupported historical claims. Its first engineering contribution is an on-device artifact detector: a 5.97 MB TensorFlow Lite YOLOv8n model that achieves mAP@0.5 = 0.995 and mAP@0.5:0.95 = 0.924 and was developed through a data-quality-driven iteration process. Its second is a bilingual Retrieval-Augmented Generation (RAG) guide grounded in a 108-record ChromaDB knowledge base; it uses Gemma 4 E2B (Q4_K_M) and incorporates ten optimizations that reduce end-to-end latency from over 30 seconds to approximately 10 seconds. Both subsystems are integrated into a production Flutter application with a bilingual interface, museum location gating, and text-to-speech support.
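To make the two detection metrics in the summary concrete: mAP@0.5 counts a prediction as correct when its box overlaps the ground truth with an intersection over union (IoU) of at least 0.5, while mAP@0.5:0.95 averages over ten stricter thresholds from 0.50 to 0.95. The following is a minimal sketch of that matching criterion only, not the evaluation code used for TimeLens; the function names and the corner-format boxes are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds behind mAP@0.5:0.95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, true_box):
    """List the IoU thresholds at which this prediction counts as a match."""
    score = iou(pred_box, true_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


if __name__ == "__main__":
    # A prediction shifted slightly to the right of the ground-truth box.
    print(thresholds_passed((18, 0, 100, 100), (0, 0, 100, 100)))
```

A loosely placed box can still pass at 0.5 yet fail at the stricter thresholds, which is why the second metric is usually the lower of the two.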
Key takeaway
For AI engineers building on-device applications for complex visual environments or real-time Q&A, prioritize data quality and iterative refinement when developing the vision model. The TimeLens project shows that label quality is decisive for achieving high accuracy with a compact model such as YOLOv8n. Apply targeted optimizations to the Retrieval-Augmented Generation (RAG) pipeline to cut end-to-end latency and keep the experience responsive. Consider a hybrid architecture that pairs on-device vision with a grounded RAG knowledge base for factual accuracy and bilingual support.
Key insights
On-device artifact recognition and RAG can deliver real-time, accurate, bilingual museum guides, overcoming data and latency challenges.
Principles
- Label quality is decisive for on-device object detection.
- Iterative data-quality-driven development improves model performance.
- Targeted optimizations significantly reduce RAG system latency.
Method
Develop the on-device detector through data-quality iteration (auto-annotation, label cleaning, hand annotation). Integrate it with a RAG system that uses a local knowledge base and an optimized LLM for bilingual Q&A.
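One way to picture the label-cleaning step is an automatic pass that flags suspicious auto-generated labels for manual re-annotation. The sketch below assumes labels in the common YOLO text format (`class x_center y_center width height`, all coordinates normalized to [0, 1]). The checks, the `MIN_SIDE` threshold, and the function names are hypothetical and are not taken from the TimeLens pipeline.

```python
NUM_CLASSES = 51  # the summary mentions 51 catalogued artifacts
MIN_SIDE = 0.02   # hypothetical: flag boxes thinner than 2% of the image


def check_label_line(line):
    """Return a list of problems found in one 'class xc yc w h' label line."""
    parts = line.split()
    if len(parts) != 5:
        return ["expected 5 fields"]
    try:
        cls = int(parts[0])
        xc, yc, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return ["unparseable field"]
    problems = []
    if not 0 <= cls < NUM_CLASSES:
        problems.append("class id out of range")
    if w <= 0 or h <= 0:
        problems.append("non-positive size")
    elif min(w, h) < MIN_SIDE:
        problems.append("suspiciously small box")
    # Box edges must stay inside the normalized image area.
    if xc - w / 2 < 0 or xc + w / 2 > 1 or yc - h / 2 < 0 or yc + h / 2 > 1:
        problems.append("box extends outside image")
    return problems


def review_queue(label_lines):
    """Pair each flagged line index with its problems, for hand annotation."""
    flagged = []
    for index, line in enumerate(label_lines):
        problems = check_label_line(line)
        if problems:
            flagged.append((index, problems))
    return flagged


if __name__ == "__main__":
    sample = ["3 0.5 0.5 0.4 0.4", "51 0.5 0.5 0.4 0.4", "0 0.95 0.5 0.2 0.2"]
    for index, problems in review_queue(sample):
        print(index, problems)
```

Checks like these only catch structural errors; a box drawn around the wrong artifact still needs a human reviewer, which is where the hand-annotation stage comes in.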
In practice
- Use YOLOv8n for efficient on-device object detection.
- Ground RAG with a ChromaDB knowledge base for factual accuracy.
- Optimize LLM inference for mobile RAG latency reduction.
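The grounding idea in the list above can be sketched without any external dependency: retrieve records for the recognized artifact, and decline to call the language model when nothing relevant is found. This is a toy illustration under stated assumptions. Word overlap stands in for the embedding search a vector store such as ChromaDB would perform, the records are made-up placeholders rather than entries from the 108-record knowledge base, and filtering by detected artifact is an assumed design, not a confirmed detail of TimeLens.

```python
import re

# Hypothetical placeholder records; the real knowledge base is not reproduced here.
RECORDS = [
    {"id": "rec-001", "artifact": "artifact_a", "text": "Artifact A is a limestone statue."},
    {"id": "rec-002", "artifact": "artifact_b", "text": "Artifact B is a wooden chest."},
]


def tokens(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, artifact, records, min_overlap=2):
    """Return records for the detected artifact that share enough words with the question."""
    question_tokens = tokens(question)
    scored = []
    for record in records:
        if record["artifact"] != artifact:
            continue  # only consider records about the recognized exhibit
        overlap = len(question_tokens & tokens(record["text"]))
        if overlap >= min_overlap:
            scored.append((overlap, record))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]


def build_prompt(question, artifact, records):
    """Build a context-restricted prompt, or return None when nothing supports an answer."""
    hits = retrieve(question, artifact, records)
    if not hits:
        return None  # caller shows a "no information available" message instead
    context = "\n".join(f"[{r['id']}] {r['text']}" for r in hits)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say you do not know.\n"
        f"Context:\n{context}\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("Is artifact A made of limestone?", "artifact_a", RECORDS))
    print(build_prompt("Who discovered it?", "artifact_a", RECORDS))
```

Returning `None` before generation is one simple way to avoid unsupported claims, and skipping the model call for unanswerable questions also saves latency. A real system would replace the word-overlap score with similarity search over embeddings.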
Topics
- On-Device AI
- Artifact Recognition
- Retrieval-Augmented Generation
- Computer Vision
- Mobile Applications
- Grand Egyptian Museum
Best for: Computer Vision Engineer, NLP Engineer, AI Scientist, AI Engineer, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.