McKenzie Marshall: NLP in Asset Management (Barings)

2019-07-06 · Source: Explosion · Developer tools and consulting for AI, Machine Learning and NLP - Explosion.ai · Field: Finance & Economics — Capital Markets & Investment Management, FinTech & Digital Financial Services · Depth: Intermediate, long

Summary

McKenzie Marshall of Barings discusses how Natural Language Processing (NLP) is applied in active asset management, where it augments investment research in pursuit of "alpha" by efficiently processing large volumes of qualitative data such as news and regulatory documents. Barings' approach is to assist analysts, not to automate their roles. Putting NLP into production involved three sequential subtasks: identifying companies in documents (a Named Entity Recognition, or NER, problem), resolving those entities to internal IDs with algorithmic methods, and adding a ranking metric, initially a sentiment score. The main NER challenges were training a custom "company" label that filters out irrelevant entities, such as regulatory bodies or platform products used as verbs, and distinguishing company names from general acronyms and product modifiers. Sentiment analysis was difficult because most text is neutral, polarity conventions had to be defined, and language varies between journalistic and regulatory sources. Focused annotation management was crucial for building specialized, high-value models.

Key takeaway

Machine learning engineers building NLP solutions in finance should prioritize augmenting human analysts over full automation. Success hinges on carefully training custom Named Entity Recognition models, backed by focused annotation management, so that domain-specific entities are identified precisely. Combine NLP outputs with robust rules-based entity resolution, and simplify sentiment scores into clear buckets (good/bad/neutral) to deliver pragmatic, high-value tools that directly support investment research.

Key insights

Effective NLP in asset management augments human analysis by precisely identifying and contextualizing company-specific information from diverse text sources.

Principles

Augment, don't automate, human analytical processes.
Specialized annotation management drives business value.
Algorithmic cleaning makes an NLP pipeline more practical.
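
To make the annotation principle concrete, here is a minimal sketch of how training examples for a custom company label might be stored and sanity-checked. It uses character-offset spans, a common convention for NER training data, and is not taken from the talk: the `COMPANY` label name, the `annotate` and `validate` helpers, and the example sentences (including the fictitious "Acme Widgets") are all assumptions made for illustration.

```python
LABEL = "COMPANY"  # hypothetical name for the custom label


def annotate(text, company_mentions):
    """Build a (text, spans) example by locating each mention in the text."""
    spans = []
    for mention in company_mentions:
        start = text.find(mention)
        if start == -1:
            raise ValueError(f"mention not found in text: {mention!r}")
        spans.append((start, start + len(mention), LABEL))
    return text, sorted(spans)


def validate(example):
    """Check that spans are in bounds, non-overlapping, and tightly trimmed."""
    text, spans = example
    prev_end = 0
    for start, end, label in spans:
        if label != LABEL or not (prev_end <= start < end <= len(text)):
            return False
        if text[start:end] != text[start:end].strip():
            return False
        prev_end = end
    return True


# Positive example plus two negatives: a regulatory body and a product
# name used as a verb carry no COMPANY span.
EXAMPLES = [
    annotate("Barings raised its stake in Acme Widgets.", ["Barings", "Acme Widgets"]),
    annotate("The SEC opened a review of the filing.", []),
    annotate("Analysts had to google the ticker.", []),
]
```

Keeping deliberately unlabeled negatives next to the positives is one way to teach a model the distinctions described in the summary; how many such examples are needed is an empirical question.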

Method

Putting NLP into production for text consumption involves three steps: identifying companies with a custom NER model, resolving those entities to internal IDs with rules-based fuzzy matching, and deriving relevance metrics such as sentiment from polar sentences.
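
The entity-resolution step could be sketched roughly as follows. This is a minimal illustration built on Python's standard `difflib` module, not Barings' implementation; the ID table, the legal-suffix list, the `normalize` and `resolve_entity` names, and the 0.85 cutoff are all assumptions chosen for the example.

```python
import difflib
import re

# Hypothetical internal ID table; the IDs are invented for illustration.
INTERNAL_IDS = {
    "apple": "CO-0001",
    "alphabet": "CO-0002",
    "royal dutch shell": "CO-0003",
}

# Assumed list of trailing legal suffixes to strip before matching.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "plc", "ltd", "llc", "co", "company"}


def normalize(name):
    """Lowercase, drop punctuation, and strip trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def resolve_entity(mention, cutoff=0.85):
    """Map a company mention to an internal ID, or None if nothing matches."""
    key = normalize(mention)
    if not key:
        return None
    # Rule 1: exact match on the normalized name.
    if key in INTERNAL_IDS:
        return INTERNAL_IDS[key]
    # Rule 2: fuzzy match, accepted only above a conservative cutoff.
    close = difflib.get_close_matches(key, list(INTERNAL_IDS), n=1, cutoff=cutoff)
    return INTERNAL_IDS[close[0]] if close else None
```

Trying exact rules first and falling back to fuzzy matching keeps the common cases cheap and predictable; a high cutoff trades some recall for fewer false links, which may or may not suit a given workflow.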

In practice

Train custom NER labels for domain-specific entities.
Combine NLP with rules-based algorithms for entity resolution.
Bucket sentiment scores (good/bad/neutral) for clarity.
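
The bucketing idea can be sketched in a few lines. This is an assumption-laden illustration, not the model described in the talk: it assumes sentence scores in the range [-1, 1] where positive means good for the company, and the 0.25 threshold and the `bucket` and `document_bucket` names are hypothetical.

```python
def bucket(score, threshold=0.25):
    """Collapse a continuous sentiment score into good/bad/neutral."""
    if score >= threshold:
        return "good"
    if score <= -threshold:
        return "bad"
    return "neutral"


def document_bucket(sentence_scores, threshold=0.25):
    """Bucket a document using only its polar sentences.

    Most sentences are neutral, so averaging over all of them would
    dilute the signal; here neutral sentences are dropped first.
    """
    polar = [s for s in sentence_scores if abs(s) >= threshold]
    if not polar:
        return "neutral"
    return bucket(sum(polar) / len(polar), threshold)
```

For example, `document_bucket([0.0, 0.05, -0.6, 0.0])` returns `"bad"`, because the single polar sentence determines the result and the neutral ones are ignored.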

Topics

Natural Language Processing
Asset Management
Named Entity Recognition
Sentiment Analysis
Data Annotation
Entity Resolution

Best for: NLP Engineer, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Explosion · Developer tools and consulting for AI, Machine Learning and NLP - Explosion.ai.