Are LLMs safe?
Summary
Suchin Gururangan, a Young Investigator at Allen Institute for Artificial Intelligence and Data Science Engineer at Appuri, discussed sociotechnical approaches to training Large Language Models (LLMs) on the NLP Highlights podcast. He highlighted critical issues with current LLM training methods, which typically pre-train transformer models on vast, indiscriminately collected internet corpora. Gururangan's research, including his PhD work at the University of Washington, focuses on the relationship between LLM behavior and training data, and argues for greater attention to language variation and post-training customization. He detailed how "quality filters" used to build datasets such as GPT-3's can inadvertently introduce bias: they over-represent content from well-resourced, urban, and wealthy areas while implicitly disfavoring content from rural or less-resourced regions. The resulting models are not truly "general purpose"; they are constrained by the implicit ideologies of their curators.
Key takeaway
AI Scientists and Research Scientists developing or deploying LLMs should recognize that current training practices embed biases through data curation. Prioritize customization and adaptation strategies, such as multi-stage adaptive pre-training or task arithmetic, to tailor models to specific domains and mitigate unintended biases. Focus on curating high-quality, domain-relevant data: it is key to efficient scaling and to achieving the desired model capabilities, even for high-resource teams.
Key insights
LLM training data curation, especially via quality filters, embeds implicit biases that shape model behavior and capabilities.
Principles
- No truly general-purpose LLM exists, because data curation is inherently subjective.
- Customization and adaptation are crucial for LLMs to meet diverse use cases.
- Data quality and relevance are paramount for efficient and effective LLM training.
Method
Adaptive pre-training adapts a model in multiple stages: first to a broad domain, then to increasingly specific task data. Task arithmetic merges or interpolates "task vectors" (the weight differences between a fine-tuned model and its pre-trained base) to compose new model behaviors, such as non-toxic chat.
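A minimal sketch of task arithmetic on toy weights may make the idea concrete. It assumes each model is a plain dictionary of parameter lists; the names `task_vector` and `apply_task_vectors`, the parameter name `layer.w`, and all numbers are hypothetical illustrations, not taken from the episode or from any library.

```python
from typing import Dict, List

# Hypothetical stand-in for a model checkpoint: parameter name -> flat weights.
Weights = Dict[str, List[float]]


def task_vector(pretrained: Weights, finetuned: Weights) -> Weights:
    """Task vector = fine-tuned weights minus pre-trained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def apply_task_vectors(
    pretrained: Weights, vectors: List[Weights], coeffs: List[float]
) -> Weights:
    """Add a weighted sum of task vectors to the pre-trained weights.

    A positive coefficient adds a behavior; a negative one subtracts it.
    """
    result = {name: list(values) for name, values in pretrained.items()}
    for vec, coeff in zip(vectors, coeffs):
        for name in result:
            result[name] = [w + coeff * d for w, d in zip(result[name], vec[name])]
    return result


# Toy weights (illustrative numbers only).
base = {"layer.w": [1.0, 2.0, 3.0]}
chat_model = {"layer.w": [1.5, 2.0, 2.0]}   # base fine-tuned for chat
toxic_model = {"layer.w": [1.0, 3.0, 3.0]}  # base fine-tuned on toxic text

chat_vec = task_vector(base, chat_model)
toxic_vec = task_vector(base, toxic_model)

# Compose "chat" while negating "toxic".
merged = apply_task_vectors(base, [chat_vec, toxic_vec], [1.0, -1.0])
print(merged["layer.w"])
```

In a real setting the same arithmetic would run over full checkpoint tensors, and the coefficients would usually be tuned on held-out data rather than fixed at 1 and -1.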
In practice
- Use parameter-efficient techniques (e.g., adapters) for low-resource adaptation.
- Employ retrieval-augmented generation (RAG) for black-box model adaptation.
- Consider task arithmetic to combine desired model behaviors modularly.
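The retrieval-augmented generation point above can be sketched roughly as follows. This is an assumption-laden toy: retrieval uses bag-of-words cosine similarity instead of a learned retriever, the documents are invented for illustration, and the black-box model itself is left out, since only the prompt it receives is changed. The helper names `retrieve` and `build_prompt` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def bow(text: str) -> Counter:
    """Bag-of-words counts over lowercased whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]


def build_prompt(query: str, docs: List[str], k: int = 1) -> str:
    """Prepend retrieved context to the query; the model weights are untouched."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Hypothetical domain documents standing in for a private corpus.
docs = [
    "Adapters add a small number of trainable parameters to a frozen model.",
    "Quality filters can favor text from wealthy urban areas.",
    "Task vectors are weight differences between fine-tuned and pre-trained models.",
]

prompt = build_prompt("what are task vectors", docs)
print(prompt)  # this string would be sent to the black-box model
```

Because adaptation happens entirely in the prompt, this approach works even when only API access to the model is available.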
Topics
- LLM Customization
- Data Curation Bias
- Task Arithmetic
- Retrieval-Augmented Generation
- LLM Data Governance
Best for: AI Scientist, Research Scientist, CTO, AI Researcher, Machine Learning Engineer, Data Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by NLP Highlights.