Evaluating Post-hoc Explanations of the Transformer-based Genome Language Model DNABERT-2

· Source: Machine Learning · Field: Science & Research — Life Sciences & Biology, Mathematics & Computational Sciences · Depth: Expert, quick

Summary

Researchers adapted AttnLRP, an extension of layer-wise relevance propagation to the attention mechanism, to produce post-hoc explanations for the Transformer-based genome language model (gLM) DNABERT-2. The work asks whether explanations, which are known to capture relevant patterns when applied to convolutional neural networks (CNNs) on genome sequences, carry over to more expressive Transformer architectures. The study proposes strategies for transferring explanations between the token and nucleotide levels, evaluates the adapted AttnLRP on genomic datasets using multiple metrics, and compares DNABERT-2's explanations in detail with those of a baseline CNN. The findings indicate that AttnLRP produces reliable explanations that align with known biological patterns, so gLMs, like CNNs, can be used to derive biological insights.

Key takeaway

For AI Scientists and Research Scientists who develop or apply genome language models, interpretability matters as much as accuracy. This research indicates that Transformer-based models such as DNABERT-2 yield biologically meaningful explanations when paired with an adapted method such as AttnLRP. Consider adding post-hoc explanation techniques of this kind to your workflow to validate model predictions and generate new biological hypotheses, rather than relying on predictive performance alone.

Key insights

AttnLRP reliably explains Transformer-based genome language models and recovers known biological patterns, as explanations of CNNs do.

Method

AttnLRP, which extends layer-wise relevance propagation to Transformer attention mechanisms, was adapted and applied to DNABERT-2, together with strategies for transferring explanations between the token level and the nucleotide level.
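
DNABERT-2 tokenizes sequences with byte-pair encoding, so one token can cover several nucleotides, and token-level relevance scores have to be mapped onto individual bases before they can be compared with a nucleotide-level CNN. The summary does not say which transfer strategies the authors use. The sketch below is only an illustration under a simple assumption: each token's relevance is split evenly across its nucleotides, which keeps the total relevance unchanged. The function names and the example scores are hypothetical, and special tokens such as [CLS] are assumed to have been dropped beforehand.

```python
from typing import List, Sequence


def token_to_nucleotide(tokens: Sequence[str],
                        token_relevance: Sequence[float]) -> List[float]:
    """Spread each token's relevance evenly over the nucleotides it covers.

    Assumption (not taken from the paper): a uniform split per token.
    The sum of the returned scores equals the sum of the token scores.
    """
    if len(tokens) != len(token_relevance):
        raise ValueError("need exactly one relevance score per token")
    per_nucleotide: List[float] = []
    for token, relevance in zip(tokens, token_relevance):
        if not token:
            raise ValueError("empty token; drop special tokens first")
        share = relevance / len(token)
        per_nucleotide.extend([share] * len(token))
    return per_nucleotide


def nucleotide_to_token(tokens: Sequence[str],
                        nucleotide_relevance: Sequence[float]) -> List[float]:
    """Sum nucleotide-level scores over the span of each token."""
    if sum(len(t) for t in tokens) != len(nucleotide_relevance):
        raise ValueError("token spans do not cover the sequence exactly")
    per_token: List[float] = []
    position = 0
    for token in tokens:
        span = nucleotide_relevance[position:position + len(token)]
        per_token.append(sum(span))
        position += len(token)
    return per_token


# Hypothetical example: three variable-length tokens with made-up scores.
tokens = ["ATG", "C", "GGTA"]
token_scores = [0.6, -0.2, 0.8]

nucleotide_scores = token_to_nucleotide(tokens, token_scores)
round_trip = nucleotide_to_token(tokens, nucleotide_scores)
```

Here `nucleotide_scores` holds one value per base of the sequence ATGCGGTA, and summing back over token spans recovers the original token scores up to floating-point error. Other choices, such as copying the token score to every nucleotide, would not preserve the total.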

Best for: AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.