Introducing spaCy v3.1
Summary
spaCy v3.1 builds on spaCy v3, which introduced transformer-based pipelines and a new training system. The release adds the ability to use predicted annotations during training, so a component can learn from the predictions of components that run earlier in the pipeline instead of being updated in isolation. It also adds a new component for predicting arbitrary and potentially overlapping spans of text, which covers information extraction tasks that non-overlapping named entities cannot express. Finally, spaCy v3.1 expands language support with new pre-trained pipelines for Catalan and Danish.
Key takeaway
NLP engineers already working with spaCy v3 should consider upgrading to v3.1. Components can now be trained on the predictions of earlier pipeline components, and the new span prediction component handles arbitrary, overlapping spans of text. The release also adds pre-trained pipelines for Catalan and Danish, which extends the range of multilingual projects spaCy supports out of the box.
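As a rough illustration of training on predicted annotations: in spaCy's training config this is controlled by the `annotating_components` setting in the `[training]` block, which lists the components whose predictions are written to the docs during training. The sketch below holds a small config fragment in a Python string and checks it with the standard library. The component names (`parser`, `tagger`) and the `runs_before` helper are assumptions made for the example, and `configparser` is only a stand-in for spaCy's own config loader.

```python
import configparser
import json

# Illustrative fragment of a training config. The component names are
# assumptions; only the annotating_components setting is the point here.
CONFIG_FRAGMENT = """
[nlp]
pipeline = ["parser", "tagger"]

[training]
annotating_components = ["parser"]
"""

# configparser is a stand-in for spaCy's own config loader, used only so
# this sketch runs with the standard library.
config = configparser.ConfigParser()
config.read_string(CONFIG_FRAGMENT)
pipeline = json.loads(config["nlp"]["pipeline"])
annotating = json.loads(config["training"]["annotating_components"])


def runs_before(component, later, order):
    """Return True if `component` comes before `later` in the pipeline order."""
    return order.index(component) < order.index(later)


# The tagger can only see the parser's predictions if the parser runs first.
parser_feeds_tagger = all(runs_before(name, "tagger", pipeline) for name in annotating)
print(annotating, parser_feeds_tagger)
```

The ordering check reflects the main design constraint: a component can only use predictions from components placed before it in the pipeline.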
Key insights
spaCy v3.1 adds training on predicted annotations, a component for predicting overlapping spans, and pre-trained pipelines for Catalan and Danish.
In practice
- Use predicted annotations from earlier pipeline components during training.
- Identify arbitrary and overlapping text spans.
- Process Catalan and Danish language texts.
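To make the overlapping-span point concrete, here is a minimal sketch using `doc.spans`, the container spaCy uses for groups of spans that may overlap. The spans are built by hand rather than predicted by a trained component, and the example sentence, labels, and the group key `"sc"` are assumptions chosen for illustration. A blank English pipeline is used so that no pre-trained model is required.

```python
import spacy
from spacy.tokens import Span

# A blank English pipeline only tokenizes, so no model download is needed.
nlp = spacy.blank("en")
doc = nlp("Welcome to the Bank of China.")

# Token indices: Welcome(0) to(1) the(2) Bank(3) of(4) China(5) .(6)
bank_of_china = Span(doc, 3, 6, label="ORG")
china = Span(doc, 5, 6, label="GPE")

# doc.spans maps a key to a group of spans that are allowed to overlap.
# The key "sc" is used here as an arbitrary group name.
doc.spans["sc"] = [bank_of_china, china]

overlapping = [(span.text, span.label_) for span in doc.spans["sc"]]
print(overlapping)
```

Both spans cover the token "China", which is not possible with `doc.ents`, where each token can belong to at most one entity.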
Topics
- spaCy
- NLP Frameworks
- Transformer Pipelines
- Model Training
- Span Annotation
- Multilingual Support
Best for: NLP Engineer, Machine Learning Engineer, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Explosion · Developer tools and consulting for AI, Machine Learning and NLP - Explosion.ai.