EVE: A Domain-Specific LLM Framework for Earth Intelligence

2026-04-16 · Source: cs.AI updates on arXiv.org · Field: Science & Research — Environmental Science & Earth Systems, Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

Earth Virtual Expert (EVE) is an open-source, end-to-end framework for developing and deploying domain-specialized Large Language Models (LLMs) for Earth Intelligence (EI). Its core component, EVE-Instruct, is a 24B-parameter model built on Mistral Small 3.2 and optimized for reasoning and question answering in Earth Observation (EO) and Earth Sciences. EVE includes curated training corpora (2.8B open-access tokens, 10.7B synthetic instruction tokens) and the first systematic domain-specific evaluation benchmarks, covering multiple-choice QA, open-ended QA, and factuality. The framework integrates Retrieval-Augmented Generation (RAG) and a hallucination-detection pipeline into a production deployment, served via API and GUI, that has supported 350 pilot users. All models, datasets, and code are openly released on Hugging Face and GitHub. EVE-Instruct shows strong performance on domain-specific tasks while preserving general capabilities.

Key takeaway

For AI Engineers developing domain-specific LLMs, EVE demonstrates that combining domain adaptation, curated data, and robust evaluation can deliver strong domain performance from a 24B-parameter model rather than from a larger one. Consider adopting a similar end-to-end framework, including synthetic data generation and a RAG pipeline with hallucination detection, to build reliable and efficient specialized AI systems for complex scientific domains.

Key insights

EVE provides an open, end-to-end framework for domain-specialized LLMs in Earth Intelligence, and its model outperforms general-purpose models on domain-specific tasks.

Principles

Domain adaptation improves performance without increasing model size.
Interleaving instruction and long-form text preserves general capabilities.
RAG and hallucination detection enhance factual reliability.

Method

EVE-Instruct was produced by fine-tuning Mistral Small 3.2 with a mixed data strategy that combines general-domain replay with synthetic EO and Earth Sciences text, followed by Online Direct Preference Optimization for alignment.
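
The mixed data strategy can be illustrated with a small sketch. This is not the authors' pipeline: the function name build_mixture, the replay_fraction parameter, and its default value of 0.2 are assumptions made for illustration, since the summary does not state the actual mixing ratio. The sketch only shows the general idea of blending a fixed share of general-domain replay samples into a domain corpus and shuffling the result.

```python
import random
from typing import Sequence


def build_mixture(
    domain_docs: Sequence[str],
    general_docs: Sequence[str],
    replay_fraction: float = 0.2,  # hypothetical value, not taken from the paper
    seed: int = 0,
) -> list[tuple[str, str]]:
    """Return shuffled (source, text) pairs mixing domain data with general replay.

    replay_fraction is the target share of general-domain samples in the
    final mixture. All domain documents are kept exactly once.
    """
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)

    # Number of replay samples needed so that they make up replay_fraction
    # of the combined mixture.
    n_replay = round(len(domain_docs) * replay_fraction / (1.0 - replay_fraction))

    # Sample replay documents with replacement from the general pool.
    replay = [rng.choice(general_docs) for _ in range(n_replay)] if general_docs else []

    mixture = [("domain", d) for d in domain_docs] + [("general", g) for g in replay]
    rng.shuffle(mixture)  # interleave the two sources
    return mixture
```

In this sketch the fraction is fixed up front; a real training setup might instead schedule the ratio over time or mix at the token level rather than the document level.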

In practice

Use a two-pass chunking strategy for RAG documents.
Employ LLM-as-a-judge for open-ended QA evaluation.
Implement rolling summarization for conversational memory.
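
Two of the practices above can be sketched briefly. The summary does not say what the two chunking passes are, so the version below assumes one plausible reading: a first pass that splits on blank lines (structural boundaries) and a second pass that breaks any oversized section at word boundaries. The rolling memory likewise assumes a simple scheme in which the oldest turn is folded into a running summary once a turn limit is exceeded. The names two_pass_chunk, RollingMemory, max_chars, and max_turns are hypothetical, and the summarize callable stands in for an LLM call.

```python
from typing import Callable


def two_pass_chunk(text: str, max_chars: int = 500) -> list[str]:
    """Split text into chunks of at most max_chars characters (assumed scheme)."""
    # Pass 1: split on blank lines to respect paragraph or section boundaries.
    sections = [s.strip() for s in text.split("\n\n") if s.strip()]

    chunks: list[str] = []
    for section in sections:
        if len(section) <= max_chars:
            chunks.append(section)
            continue
        # Pass 2: break an oversized section at word boundaries.
        # A single word longer than max_chars is kept whole as its own chunk.
        current = ""
        for word in section.split():
            candidate = f"{current} {word}".strip()
            if len(candidate) > max_chars and current:
                chunks.append(current)
                current = word
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks


class RollingMemory:
    """Keep the last max_turns turns verbatim and summarize everything older."""

    def __init__(self, summarize: Callable[[str], str], max_turns: int = 4):
        self.summarize = summarize  # stand-in for an LLM summarization call
        self.max_turns = max_turns
        self.summary = ""
        self.turns: list[str] = []

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        if len(self.turns) > self.max_turns:
            # Fold the oldest turn into the running summary.
            oldest = self.turns.pop(0)
            self.summary = self.summarize(f"{self.summary}\n{oldest}".strip())

    def context(self) -> str:
        """Return the summary (if any) followed by the recent turns."""
        parts = [f"Summary: {self.summary}"] if self.summary else []
        return "\n".join(parts + self.turns)
```

The design choice in both cases is to keep structure where it exists (section boundaries, recent turns) and compress only what exceeds the budget.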

Topics

EVE Framework
Earth Intelligence
Domain-Specific LLMs
Earth Observation
Retrieval-Augmented Generation

Best for: AI Engineer, AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.