How Amazon Finance streamlines regulatory inquiries by using generative AI on AWS
Summary
Amazon's Finance Technology (FinTech) teams have developed a scalable AI application using Amazon Bedrock and other AWS services to streamline regulatory inquiry processing. This system addresses challenges like knowledge fragmentation, conversational context management, and observability by implementing Retrieval Augmented Generation (RAG) with Amazon Bedrock Knowledge Bases and Amazon OpenSearch Serverless for vector storage. The solution features an automated document ingestion pipeline that processes various formats (PDF, PPT, Word, CSV) and uses hierarchical chunking with Amazon Titan Text Embeddings. A real-time chat application, powered by Claude Sonnet 4.5 via the Converse Stream API and Amazon DynamoDB for conversation history, enables multi-turn, context-aware interactions. Comprehensive observability is achieved through OpenTelemetry and a self-hosted Langfuse instance, providing detailed traces for continuous improvement and compliance.
Key takeaway
For AI Architects designing enterprise-grade knowledge management systems, consider adopting a RAG-based architecture on AWS. Your solution should incorporate Amazon Bedrock Knowledge Bases for efficient document ingestion and retrieval, leverage multi-turn conversational capabilities with models like Claude Sonnet 4.5, and integrate robust observability tools such as OpenTelemetry and Langfuse to ensure compliance, detect hallucinations, and enable continuous performance tuning.
Key insights
Amazon FinTech uses RAG on AWS to automate regulatory inquiry responses, enhancing retrieval, conversation, and observability.
Principles
- Hierarchical chunking improves context for complex documents.
- Query expansion enhances retrieval for abbreviated user questions.
- Observability is crucial for AI system trust and improvement.
Method
The system ingests documents into Amazon Bedrock Knowledge Bases, uses Claude 3.5 Haiku for query expansion, retrieves context via OpenSearch Serverless, and generates responses with Claude Sonnet 4.5, maintaining state in DynamoDB.
In practice
- Implement hierarchical chunking for structured documents.
- Use query expansion to handle varied user inputs.
- Integrate OpenTelemetry for end-to-end AI workflow tracing.
Topics
- Amazon FinTech
- Regulatory Inquiry Automation
- Generative AI on AWS
- Retrieval-Augmented Generation
- Amazon Bedrock Knowledge Bases
Best for: AI Engineer, MLOps Engineer, AI Architect
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.