NCTB-QA: A Large-Scale Bangla Educational Question Answering Dataset and Benchmarking Performance

2026-03-05 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, quick

Summary

NCTB-QA is a large-scale Bangla question answering dataset built to address reading comprehension challenges in low-resource languages, particularly the handling of unanswerable questions. It comprises 87,805 question-answer pairs derived from 50 textbooks published by Bangladesh's National Curriculum and Textbook Board. The dataset is roughly balanced between answerable (57.25%) and unanswerable (42.75%) questions and includes adversarially designed instances with plausible distractors. Benchmarking three transformer-based models (BERT, RoBERTa, ELECTRA) on NCTB-QA showed substantial gains from fine-tuning: BERT's F1 score, for example, rose from 0.150 to 0.620, a 313% relative improvement. Semantic answer quality, as measured by BERTScore, also improved substantially across all evaluated models, establishing NCTB-QA as a challenging benchmark.
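The 313% figure for BERT follows directly from the two F1 scores quoted in the summary. The short sketch below reproduces the arithmetic; the helper name `relative_improvement` is illustrative and not taken from the paper.

```python
def relative_improvement(before: float, after: float) -> float:
    """Return the relative change from `before` to `after` as a percentage."""
    if before == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return (after - before) / before * 100.0


# F1 scores for BERT before and after fine-tuning, as quoted in the summary.
bert_gain = relative_improvement(0.150, 0.620)
print(f"BERT relative F1 improvement: {bert_gain:.0f}%")
```

Note that a relative improvement of 313% corresponds to an absolute gain of 0.470 F1 points, so the large percentage partly reflects the low starting score.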

Key takeaway

AI scientists developing reading comprehension systems for low-resource languages should prioritize creating or using datasets with a balanced mix of answerable and unanswerable questions. Fine-tuning transformer models on domain-specific, adversarially designed datasets such as NCTB-QA is critical for robust performance and for improving F1 scores and semantic answer quality in challenging linguistic environments.

Key insights

Domain-specific fine-tuning is crucial for robust QA performance in low-resource languages, especially with unanswerable questions.

Principles

Balanced datasets improve QA robustness.
Adversarial examples enhance model training.

Method

The method involves creating a large-scale QA dataset from educational textbooks, balancing answerable and unanswerable questions, and including adversarial distractors to benchmark transformer models.
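To make the role of unanswerable questions concrete, here is a minimal sketch of a token-level F1 score that also handles unanswerable items. It assumes the SQuAD 2.0 convention, where an unanswerable question has an empty reference answer and the model earns credit only by abstaining. The summary does not state how the paper computes F1, so the convention, the whitespace tokenization, and the function name `token_f1` are all assumptions made for illustration.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer.

    Assumption: an empty reference marks an unanswerable question, and an
    empty prediction means the model abstained (SQuAD 2.0 style).
    """
    pred_tokens = prediction.split()
    ref_tokens = reference.split()

    # If either side is empty, score 1.0 only when both are empty,
    # i.e. the model correctly abstained on an unanswerable question.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Count tokens shared by prediction and reference (with multiplicity).
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical examples, not drawn from NCTB-QA.
answerable_score = token_f1("a b", "a b c")      # partial overlap
abstained_score = token_f1("", "")               # correct abstention
hallucinated_score = token_f1("some answer", "")  # answered an unanswerable question
```

Under this convention, a model that always produces an answer scores zero on every unanswerable item, which is why a dataset with a large unanswerable share penalizes models that never abstain.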

In practice

Use NCTB-QA for Bangla QA research.
Fine-tune models on domain-specific data.

Topics

Bangla Question Answering
Low-Resource NLP
Educational Datasets
Transformer Models
Model Fine-tuning

Best for: AI Scientist, Research Scientist, AI Researcher, NLP Engineer, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.