F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World
Summary
F2LLM-v2 is a new family of general-purpose, multilingual embedding models available in 8 sizes, from 80M to 14B parameters. The models support over 200 languages, with a focus on mid- and low-resource languages, and were trained on a dataset of 60 million samples. By combining a two-stage LLM-based embedding training pipeline with matryoshka learning, model pruning, and knowledge distillation, F2LLM-v2 achieves high efficiency while maintaining competitive performance. The F2LLM-v2-14B model ranks first on 11 MTEB benchmarks, and its smaller counterparts set a new standard for resource-constrained applications. All models, data, code, and intermediate checkpoints are openly released to support further research.
Key takeaway
For NLP engineers developing multilingual applications, F2LLM-v2 offers embedding models at sizes from 80M to 14B parameters, so model size can be matched to the deployment budget. Consider integrating F2LLM-v2 models to improve performance and efficiency, especially for projects targeting mid- and low-resource languages. The open release of models, data, code, and checkpoints makes it possible to experiment with and fine-tune these models for specific use cases, potentially reducing development costs and improving language coverage.
Key insights
F2LLM-v2 offers efficient, performant, and inclusive multilingual embeddings for over 200 languages.
Principles
- Multilingual support extends to underserved languages.
- Efficiency need not come at the cost of competitive performance.
- Open-sourcing fosters embedding model research.
Method
A two-stage LLM-based embedding training pipeline integrates matryoshka learning, model pruning, and knowledge distillation to create efficient, performant multilingual models.
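Of the techniques named above, matryoshka learning is the easiest to illustrate from the consumer side. In matryoshka representation learning generally, a model is trained so that the leading coordinates of an embedding remain usable on their own, which lets a caller truncate vectors to save storage and compute. The sketch below shows only that truncate-and-renormalize step in plain Python. The vectors are toy values, the function names are hypothetical, and nothing here is taken from the F2LLM-v2 code release; which truncation sizes the released models actually support is not stated in this summary.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` coordinates and rescale the result to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors standing in for real model outputs (assumption).
query = [0.9, 0.1, 0.3, -0.2, 0.05, 0.0, -0.1, 0.02]
doc_a = [0.8, 0.2, 0.25, -0.1, -0.3, 0.4, 0.1, -0.2]
doc_b = [-0.5, 0.7, -0.1, 0.3, 0.05, 0.0, -0.1, 0.02]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    a = truncate_and_normalize(doc_a, dim)
    b = truncate_and_normalize(doc_b, dim)
    # Print the similarity of the query to each document at this prefix length.
    print(dim, round(cosine(q, a), 3), round(cosine(q, b), 3))
```

With a model trained this way, the ranking of documents should stay largely stable as the prefix shrinks; with an ordinary embedding model, truncation carries no such expectation.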
In practice
- Use F2LLM-v2-14B when benchmark accuracy matters most; it ranks first on 11 MTEB benchmarks.
- Deploy smaller F2LLM-v2 models for resource-constrained tasks.
- Explore released code and data for custom embedding research.
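When choosing between the largest and smallest models, a rough estimate of weight memory helps. The sketch below multiplies parameter count by bytes per parameter for the two sizes named in this summary (80M and 14B). It is a back-of-the-envelope figure under stated assumptions: it covers weights only, ignores activations and runtime overhead, and the function name and precision table are illustrative rather than part of any F2LLM-v2 tooling.

```python
# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="fp16"):
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# The two ends of the size range given in the summary.
sizes = {"smallest (80M)": 80_000_000, "largest (14B)": 14_000_000_000}

for label, num_params in sizes.items():
    for dtype in ("fp32", "fp16", "int8"):
        print(f"{label:>15} {dtype}: {weight_memory_gb(num_params, dtype):.2f} GB")
```

The gap between the two ends of the range is roughly 175x in weight memory at any given precision, which is why the smaller models are the natural choice for resource-constrained deployments.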
Topics
- Multilingual Embeddings
- LLM-based Embeddings
- Matryoshka Learning
- Model Pruning
- Knowledge Distillation
Best for: NLP Engineer, AI Scientist, Research Scientist, AI Researcher, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.