# I Built a System That Scores Oil Company Climate Claims Against NGO Evidence.
Summary
Claimify ESG is a RAG pipeline that scores corporate climate claims from sustainability reports against a curated corpus of NGO evidence. It reports that only 3 of 710 climate pledges made by major oil companies in 2021 were kept. The system automates the auditing of long sustainability reports and provides traceable reasoning for each verdict. Its five-stage pipeline covers PDF ingestion, NLP filtering with ClimateBERT followed by claim extraction with GPT-4o, two-stage retrieval using SBERT and a cross-encoder, LLM scoring with GPT-4o-mini, and materiality adjustment. The pipeline reached 86.7% accuracy on a 60-claim hand-labelled evaluation set. It separates falsifiable claims from aspirations and grounds each verdict in specific retrieved evidence.
Key takeaway
For ESG analysts or AI engineers building auditable compliance systems, this RAG pipeline offers a blueprint for robust claim verification. You should adopt a multi-stage approach, pre-filtering irrelevant text and explicitly separating classification logic to improve accuracy and reduce costs. Grounding LLM verdicts in specific, retrieved evidence ensures traceability, allowing you to provide concrete, defensible scores for complex corporate statements rather than relying on vague sentiment analysis or manual review.
Key insights
A RAG pipeline can provide traceable, evidence-backed verdicts for corporate climate claims, overcoming limitations of keyword matching or manual review.
Principles
- Ground LLM outputs with curated domain evidence.
- Pre-filter irrelevant data to optimize cost and speed.
- Separate complex classification into explicit steps.
Method
The pipeline ingests PDFs, filters non-climate sentences with ClimateBERT, extracts structured claims using GPT-4o, retrieves top-5 evidence passages via SBERT and a cross-encoder, scores claims with GPT-4o-mini, and adjusts risk scores by category materiality.
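The two-stage retrieval step described above could look roughly like the following sketch. It assumes passage embeddings are already computed (in the described system they would come from SBERT) and takes the reranker as a plain callable standing in for the cross-encoder. The names `two_stage_retrieve`, `rerank_score`, and `pool_size`, along with the default pool size of 50, are illustrative assumptions; only the final top-5 cut comes from the summary.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_retrieve(
    claim: str,
    claim_vec: Sequence[float],
    passages: List[str],
    passage_vecs: List[Sequence[float]],
    rerank_score: Callable[[str, str], float],  # hypothetical cross-encoder stand-in
    pool_size: int = 50,  # assumed candidate pool size, not stated in the source
    top_k: int = 5,       # the summary mentions top-5 evidence passages
) -> List[Tuple[str, float]]:
    # Stage 1: cheap vector similarity narrows the corpus to a candidate pool.
    ranked = sorted(
        range(len(passages)),
        key=lambda i: cosine(claim_vec, passage_vecs[i]),
        reverse=True,
    )
    pool = ranked[:pool_size]

    # Stage 2: a more expensive pairwise scorer reorders only the pool.
    rescored = [(passages[i], rerank_score(claim, passages[i])) for i in pool]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_k]
```

The point of the split is cost: the pairwise scorer runs on at most `pool_size` passages rather than on the whole evidence corpus, while still deciding the final order.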
In practice
- Use ClimateBERT (distilroberta-base-climate-detector) for local climate-relevance filtering.
- Implement a two-step LLM prompt for nuanced claim classification.
- Chunk long source documents into ~350-word segments with overlap for RAG.
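The chunking tip in the list above can be sketched as a small word-based splitter. This is a minimal illustration, not the article's implementation: the function name `chunk_words` and the 50-word overlap are assumptions, since the summary only says "~350-word segments with overlap".

```python
from typing import List


def chunk_words(text: str, chunk_size: int = 350, overlap: int = 50) -> List[str]:
    """Split text into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words, so a sentence cut at a chunk
    boundary still appears whole in at least one neighbouring chunk when it
    is shorter than the overlap.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reached the end of the document
    return chunks
```

Splitting on whitespace keeps the sketch dependency-free; a real pipeline might prefer sentence-aware boundaries so chunks do not start mid-sentence.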
Topics
- RAG Pipeline
- ESG Reporting
- Climate Claims
- LLM Classification
- Corporate Sustainability
- Semantic Search
- PDF Processing
Best for: AI Engineer, Data Scientist, Consultant
Editorial summary, takeaway, and curation by AIssential. Original article published by NLP on Medium.