Fine-tune Amazon Nova models for accurate email data extraction

2026-06-30 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cloud Computing & IT Infrastructure · Depth: Intermediate, long

Summary

Fine-tuning Amazon Nova models using Amazon SageMaker AI and PEFT (LoRA) enabled Parcel Perform to achieve up to 94.77% extraction accuracy, a 16.6 percentage point improvement over baseline, reduced inference latency by over 30%, and halved costs for email data extraction. This solution addresses common challenges like model hallucinations, confusion between similar data types, and high token costs associated with HTML-formatted emails. The process involves preparing training data in Amazon Bedrock conversation format, uploading it to Amazon S3, creating a fine-tuning job in SageMaker AI with LoRA configuration, and deploying the model via Amazon Bedrock for on-demand inference. Notably, the smaller Nova Micro model, when fine-tuned, outperformed the larger Nova Lite, demonstrating the effectiveness of task-specific optimization.

Key takeaway

For MLOps Engineers automating data extraction from diverse email formats, fine-tuning Amazon Nova models with SageMaker AI and PEFT (LoRA) can significantly improve accuracy (up to 94.77%), reduce inference latency (over 30%), and cut costs (50%). This approach allows deploying customized models on Amazon Bedrock with on-demand, token-based pricing, eliminating dedicated LLM infrastructure. You should prepare at least 1,300 training samples in Bedrock conversation format to achieve meaningful results and ensure your data represents production email variety.

Key insights

Fine-tuning Amazon Nova models with PEFT significantly boosts email data extraction accuracy and efficiency while reducing costs.

Principles

PEFT (LoRA) enables efficient model customization with limited data.
Smaller, fine-tuned models can outperform larger alternatives on specific tasks.
Modest training data increases (e.g., 1,300 samples) yield significant accuracy gains.

Method

Prepare training data in Amazon Bedrock conversation format, upload to S3, create a SageMaker AI fine-tuning job using LoRA, then deploy via Amazon Bedrock for on-demand inference.

In practice

Use Nova Micro for cost-effective, high-accuracy entity extraction.
Start with at least 1,300 training samples for meaningful results.
Deploy PEFT models with on-demand, token-based pricing on Amazon Bedrock.

Topics

Amazon Nova
Parameter-Efficient Fine-Tuning
LoRA
Amazon SageMaker AI
Email Data Extraction
Amazon Bedrock

Code references

aws-samples/amazon-nova-samples

Best for: Machine Learning Engineer, MLOps Engineer, AI Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.