TimeLens: On-Device Artifact Recognition with Retrieval-Augmented Question Answering for the Grand Egyptian Museum

Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Intermediate, quick

Summary

TimeLens is an AI-powered bilingual mobile guide developed for the Grand Egyptian Museum (GEM). Visitors point their phone at an exhibit to get real-time artifact recognition, then ask follow-up questions in English or Arabic. The system addresses challenges including fine-grained visual similarity among 51 catalogued artifacts, the gap between curated training data and handheld camera conditions, and the need to prevent unsupported historical claims. Its first engineering contribution is an on-device artifact detector: a 5.97 MB TensorFlow Lite YOLOv8n model that achieves mAP@0.5 = 0.995 and mAP@0.5:0.95 = 0.924, developed through a data-quality-driven iteration process. The second is a bilingual Retrieval-Augmented Generation (RAG) guide grounded in a 108-record ChromaDB knowledge base. It uses Gemma 4 E2B (Q4_K_M) and applies ten optimizations that reduce end-to-end latency from over 30 seconds to approximately 10 seconds. Both subsystems are integrated into a production Flutter application with a bilingual interface, museum location gating, and text-to-speech support.
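The two detector scores differ only in how strictly a predicted box must overlap the ground truth. mAP@0.5 counts a detection as correct at an intersection-over-union (IoU) of at least 0.5, while mAP@0.5:0.95 averages over the ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates only the IoU test, not a full mAP computation; the box coordinates and function names are made up for illustration and do not come from the TimeLens project.

```python
def iou(box_a, box_b):
    """Intersection over union for boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by mAP@0.5:0.95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """Return the IoU thresholds at which this prediction counts as a match."""
    score = iou(pred, truth)
    return [t for t in IOU_THRESHOLDS if score >= t]


# Hypothetical boxes: the prediction is shifted 10 px from the ground truth.
truth = (100, 100, 300, 300)
pred = (110, 110, 310, 310)

print(f"IoU = {iou(pred, truth):.3f}")
print("matches at thresholds:", thresholds_passed(pred, truth))
```

A box that is slightly offset still passes the 0.5 threshold but fails the strictest ones, which is why mAP@0.5:0.95 is the lower and more demanding of the two reported numbers.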

Key takeaway

For AI engineers building on-device applications for complex visual environments or real-time Q&A, prioritize data quality and iterative refinement when developing the vision model. The TimeLens project demonstrates that label quality is decisive for achieving high accuracy with compact models like YOLOv8n. Apply targeted optimizations to the Retrieval-Augmented Generation (RAG) pipeline to reduce end-to-end latency and keep the user experience responsive. Consider a hybrid architecture that combines on-device vision with a grounded RAG knowledge base for factual accuracy and bilingual support.

Key insights

On-device artifact recognition and RAG can deliver real-time, accurate, bilingual museum guides, overcoming data and latency challenges.

Principles

Method

Develop on-device detectors through data-quality iteration (auto-annotation, label cleaning, hand annotation). Integrate them with a RAG system that uses a local knowledge base and an optimized LLM for bilingual Q&A.
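As a rough illustration of how a detected artifact can scope retrieval and keep answers grounded, here is a minimal sketch. It makes several assumptions: a bag-of-words cosine score stands in for the embedding search that ChromaDB would perform, the records are invented placeholders rather than entries from the 108-record knowledge base, and every name (`RECORDS`, `retrieve`, `build_prompt`, `min_score`) is hypothetical. Declining to build a prompt when nothing relevant is retrieved is one possible way to avoid unsupported claims, not necessarily the approach TimeLens takes.

```python
import math
import re
from collections import Counter

# Invented placeholder records; real entries would come from the knowledge base.
RECORDS = [
    {"artifact": "artifact_a", "lang": "en",
     "text": "Placeholder: artifact A is a statue carved from granite."},
    {"artifact": "artifact_a", "lang": "en",
     "text": "Placeholder: artifact A was found during an excavation."},
    {"artifact": "artifact_b", "lang": "en",
     "text": "Placeholder: artifact B is a mask made of gilded wood."},
]


def tokens(text):
    """Lowercased word counts; a crude stand-in for an embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, artifact, lang, min_score=0.2, top_k=2):
    """Rank records for the detected artifact and language; drop weak matches."""
    query = tokens(question)
    scored = [(cosine(query, tokens(r["text"])), r["text"])
              for r in RECORDS
              if r["artifact"] == artifact and r["lang"] == lang]
    hits = sorted((s for s in scored if s[0] >= min_score), reverse=True)
    return [text for _, text in hits[:top_k]]


def build_prompt(question, artifact, lang="en"):
    """Return an LLM prompt restricted to retrieved context, or None."""
    context = retrieve(question, artifact, lang)
    if not context:
        return None  # caller shows a fallback instead of letting the model guess
    joined = "\n".join(f"- {c}" for c in context)
    return (f"Answer using only the context below.\n"
            f"Context:\n{joined}\nQuestion: {question}")


print(build_prompt("What is artifact A carved from?", "artifact_a"))
print(build_prompt("Who built the pyramids?", "artifact_a"))
```

Filtering by the detector's output before scoring keeps records about other artifacts out of the prompt, and the score threshold gives the app a clear signal for when to show a fallback message.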

Best for: Computer Vision Engineer, NLP Engineer, AI Scientist, AI Engineer, Machine Learning Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.