Analysing Lightweight Large Language Models for Biomedical Named Entity Recognition on Diverse Output Formats

· Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Life Sciences & Biology · Depth: Expert, long

Summary

An experimental analysis investigates the use of lightweight Large Language Models (LLMs) for Biomedical Named Entity Recognition (NER), addressing the computational demands and resource constraints of larger models in healthcare settings. The study evaluates the impact of twelve different output formats on model performance, including "conv_term", "single_tag", "multi_tag", "single_code", "multi_code", "single_term", "multi_term", "single_span", "multi_span", "multi_triple", "multi_bio", and "multi_brat". Results indicate that lightweight LLMs can achieve competitive performance for biomedical information extraction. Contrary to initial assumptions, instruction tuning across many distinct formats does not improve performance, but specific formats are consistently associated with better outcomes. The research highlights the potential of these smaller models as effective alternatives for resource-constrained biomedical NLP.
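To make the idea of an output format concrete, the sketch below serialises the same annotation in two ways. This summary does not define the twelve formats, so the code rests on an assumption: that "multi_bio" refers to conventional BIO token tagging and "multi_brat" to brat-style standoff lines. The function names `to_bio` and `to_standoff` are hypothetical and are not taken from the paper.

```python
def to_bio(tokens, spans):
    """Tag tokens with BIO labels.

    spans: list of (start_token, end_token_exclusive, label).
    Assumption: this is what a BIO-style output format would look like.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags


def to_standoff(text, char_spans):
    """Emit brat-style standoff lines: ID <TAB> label start end <TAB> surface text.

    char_spans: list of (start_char, end_char_exclusive, label).
    Assumption: this is what a brat-style output format would look like.
    """
    lines = []
    for n, (start, end, label) in enumerate(char_spans, start=1):
        lines.append(f"T{n}\t{label} {start} {end}\t{text[start:end]}")
    return "\n".join(lines)


text = "Type 2 diabetes is chronic"
bio_tags = to_bio(text.split(), [(0, 3, "Disease")])
standoff = to_standoff(text, [(0, 15, "Disease")])
```

Both outputs carry the same entity. They differ in how much structure the model has to generate, which is the kind of variation the study compares.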

Key takeaway

AI engineers and research scientists developing biomedical NLP solutions should consider lightweight LLMs as viable alternatives to larger models, especially in privacy- and budget-constrained environments. Focus on choosing the output format: specific formats such as "conv_term" or "multi_code" can significantly improve performance, whereas instruction tuning across a wide array of formats showed no improvement.

Key insights

Lightweight LLMs offer competitive biomedical NER performance, with specific output formats proving more effective than multi-format tuning.

Method

The study employs instruction tuning on Causal Language Models (CLMs), framing NER as a text generation task. It incorporates entity type, document characteristics, and specific output format into the instruction-tuning dataset.
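A minimal sketch of how one instruction-tuning example might be assembled when NER is framed as text generation. The prompt template, the `Entity` class, the `build_example` function, and the semicolon-separated target are all hypothetical illustrations; the paper's actual templates and format definitions are not given in this summary.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    start: int   # character offset, inclusive
    end: int     # character offset, exclusive
    label: str   # entity type, e.g. "Chemical"


def build_example(text, entities, entity_type, doc_kind, output_format):
    """Return a (prompt, target) pair for causal-LM instruction tuning.

    The instruction carries the entity type, a document characteristic,
    and the requested output format, mirroring the three ingredients
    named in the text. The wording of the template is an assumption.
    """
    prompt = (
        f"Document type: {doc_kind}\n"
        f"Task: extract all {entity_type} mentions.\n"
        f"Output format: {output_format}\n"
        f"Text: {text}\n"
        "Answer:"
    )
    mentions = [text[e.start:e.end] for e in entities if e.label == entity_type]
    # Hypothetical target: a semicolon-separated term list, or "none".
    if mentions:
        target = " " + "; ".join(mentions)
    else:
        target = " none"
    return prompt, target


text = "Aspirin reduces fever."
entities = [Entity(0, 7, "Chemical"), Entity(16, 21, "Disease")]
prompt, target = build_example(text, entities, "Chemical", "abstract", "term list")
```

During fine-tuning, the model would be trained to continue `prompt` with `target`, so changing the `output_format` argument and the target serialisation is all it takes to produce a different format variant of the same annotated document.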

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.