Introducing spaCy v2.3
Summary
spaCy v2.3, a new release of the Natural Language Processing library, adds pretrained models for five new languages: Chinese, Japanese, Danish, Polish, and Romanian. All 15 model families have also been updated with word vectors and improved accuracy. In addition, models that include word vectors are now smaller and load faster.
Key takeaway
For NLP engineers working with multilingual data, the main reason to upgrade to spaCy v2.3 is the new set of models for Chinese, Japanese, Danish, Polish, and Romanian. Projects that move to the updated model families should also see improved accuracy, and models with word vectors are smaller and faster to load.
Key insights
spaCy v2.3 adds models for five new languages and updates all 15 model families with word vectors, improving accuracy while reducing size and loading time for models with vectors.
In practice
- Add Chinese and Japanese processing to your pipelines.
- Load models with word vectors faster.
- Get improved accuracy from the updated models.
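As a rough illustration of the points above, the following sketch shows how one of the new models might be installed, loaded, and timed. The package names in `NEW_V23_MODELS` are assumptions based on spaCy's usual naming scheme and should be checked against the official models directory; `download_command` and `timed_load` are hypothetical helpers written for this example, not part of spaCy.

```python
import time

# Assumed small-model package names for the five languages added in v2.3.
# Verify these against spaCy's models directory before relying on them.
NEW_V23_MODELS = {
    "zh": "zh_core_web_sm",
    "ja": "ja_core_news_sm",
    "da": "da_core_news_sm",
    "pl": "pl_core_news_sm",
    "ro": "ro_core_news_sm",
}


def download_command(lang):
    """Return the spaCy CLI command that installs the model for `lang`."""
    return "python -m spacy download " + NEW_V23_MODELS[lang]


def timed_load(lang):
    """Load the model for `lang`; return (nlp, seconds spent loading)."""
    # Imported lazily so download_command works without spaCy installed.
    import spacy

    start = time.perf_counter()
    nlp = spacy.load(NEW_V23_MODELS[lang])
    return nlp, time.perf_counter() - start
```

After running the command returned by `download_command("da")`, calling `timed_load("da")` gives a pipeline object plus its loading time, which can be compared against the same measurement taken with an earlier spaCy version.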
Topics
- spaCy
- Natural Language Processing
- Language Models
- Multilingual NLP
- Model Performance
- Word Vectors
Best for: NLP Engineer, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Explosion (Explosion.ai).