Huntington Bank: Redacting sensitive data from 400M+ documents with AWS
Summary
Huntington National Bank, a top 10 US bank, needed to redact sensitive customer data from more than 400 million documents accumulated since 2015. For a proactive compliance initiative in 2025, the bank designed a scalable redaction workflow on AWS using Amazon Textract, Amazon SageMaker, AWS Step Functions, and AWS Lambda. The solution cut the estimated processing time from years to months and reached a processing rate of approximately 10 million documents per day. It also met the bank's requirements for data encryption, strict access controls, PCI DSS compliance, and on-premises replication, while exceeding 95% redaction accuracy. The overall cost was approximately 5% of the original estimate, and Huntington plans to reuse the framework for future high-volume redaction needs such as mergers and acquisitions.
Key takeaway
MLOps engineers and AI architects tasked with large-scale redaction of sensitive data for compliance should consider a serverless, ML-driven architecture on AWS. Huntington Bank's results suggest that combining services such as Amazon Textract, AWS Step Functions, and AWS DataSync can cut processing timelines from years to months and costs by roughly 95%, while achieving redaction accuracy above 95%. The approach provides a reusable framework for processing hundreds of millions of documents securely and efficiently.
Key insights
Huntington Bank scaled sensitive data redaction to 400M+ documents using AWS services, cutting the estimated timeline from years to months and the cost to roughly 5% of the original estimate.
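As a rough check on the reported figures, the arithmetic below estimates how long the backlog takes at the stated peak rate. It is a minimal sketch: the function name is hypothetical, and it assumes the rate of about 10 million documents per day is sustained for the whole run, which the source does not claim.

```python
import math


def days_to_process(total_docs: int, docs_per_day: int) -> int:
    """Return the whole number of days needed at a constant daily rate."""
    if docs_per_day <= 0:
        raise ValueError("docs_per_day must be positive")
    return math.ceil(total_docs / docs_per_day)


# Figures taken from the summary: 400M documents, ~10M documents per day.
print(days_to_process(400_000_000, 10_000_000))  # prints 40
```

Forty days of pure processing is consistent with a timeline measured in months once data transfer, validation, and replication back on-premises are included.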
Principles
- Data must be encrypted at rest and in transit.
- Services used must be in-scope for PCI DSS compliance.
- Redaction accuracy must meet or exceed 95% for compliance.
Method
Transfer documents from on-premises storage to Amazon S3 with AWS DataSync, detect sensitive data with Amazon Textract in a workflow orchestrated by an AWS Step Functions distributed map state, then redact the documents and sync them back on-premises.
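The orchestration step can be sketched as a Step Functions state machine definition whose distributed map state lists objects in S3 and hands each one to a worker. This is an illustration under stated assumptions, not Huntington's actual definition: the bucket, prefix, Lambda ARN, concurrency limit, retry policy, and failure tolerance are all hypothetical placeholders.

```python
import json


def build_redaction_state_machine(bucket: str, prefix: str,
                                  worker_lambda_arn: str,
                                  max_concurrency: int = 1000) -> dict:
    """Build an Amazon States Language definition with a distributed map state."""
    return {
        "Comment": "Sketch: fan out document redaction over S3 objects",
        "StartAt": "ProcessDocuments",
        "States": {
            "ProcessDocuments": {
                "Type": "Map",
                # ItemReader lists the S3 objects that become map items.
                "ItemReader": {
                    "Resource": "arn:aws:states:::s3:listObjectsV2",
                    "Parameters": {"Bucket": bucket, "Prefix": prefix},
                },
                "ItemProcessor": {
                    # DISTRIBUTED mode runs each item as a child workflow.
                    "ProcessorConfig": {
                        "Mode": "DISTRIBUTED",
                        "ExecutionType": "STANDARD",
                    },
                    "StartAt": "DetectAndRedact",
                    "States": {
                        "DetectAndRedact": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::lambda:invoke",
                            "Parameters": {
                                "FunctionName": worker_lambda_arn,
                                "Payload.$": "$",
                            },
                            # Assumed retry policy for transient failures.
                            "Retry": [{
                                "ErrorEquals": ["States.TaskFailed"],
                                "IntervalSeconds": 2,
                                "MaxAttempts": 3,
                                "BackoffRate": 2.0,
                            }],
                            "End": True,
                        }
                    },
                },
                "MaxConcurrency": max_concurrency,
                # Assumed tolerance: fail the run if over 1% of items fail.
                "ToleratedFailurePercentage": 1,
                "End": True,
            }
        },
    }


definition = build_redaction_state_machine(
    bucket="example-documents-bucket",      # hypothetical bucket name
    prefix="incoming/",                     # hypothetical key prefix
    worker_lambda_arn="arn:aws:lambda:us-east-1:123456789012:function:redact-worker",
)
print(json.dumps(definition, indent=2))
```

The resulting JSON could be passed as the definition when creating a state machine. In this sketch the per-document work (calling Amazon Textract and writing the redacted file) would live inside the hypothetical worker function.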
In practice
- Use AWS DataSync for secure on-premises to S3 data migration.
- Orchestrate document processing with AWS Step Functions for scalability.
- Leverage Amazon Textract for automated sensitive data detection.
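To make the detection step concrete, the sketch below turns Textract-style word blocks into pixel rectangles that a redaction step could black out. Textract reports each block's bounding box as ratios of the page width and height. How Huntington decides which text is sensitive is not described in the source, so the SSN-like pattern, the function names, and the sample block are illustrative assumptions.

```python
import math
import re
from typing import Dict, List, Tuple

# Assumed rule for illustration only: text shaped like 123-45-6789.
SSN_LIKE = re.compile(r"\d{3}-\d{2}-\d{4}")

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def to_pixel_box(bbox: Dict[str, float], page_w: int, page_h: int) -> Box:
    """Convert a ratio-based bounding box to pixel coordinates."""
    left = math.floor(bbox["Left"] * page_w)
    top = math.floor(bbox["Top"] * page_h)
    right = min(page_w, math.ceil((bbox["Left"] + bbox["Width"]) * page_w))
    bottom = min(page_h, math.ceil((bbox["Top"] + bbox["Height"]) * page_h))
    return (left, top, right, bottom)


def redaction_boxes(blocks: List[dict], page_w: int, page_h: int,
                    pattern: "re.Pattern[str]" = SSN_LIKE) -> List[Box]:
    """Return pixel boxes for WORD blocks whose text matches the pattern."""
    boxes = []
    for block in blocks:
        if block.get("BlockType") != "WORD":
            continue
        if pattern.fullmatch(block.get("Text", "")):
            boxes.append(
                to_pixel_box(block["Geometry"]["BoundingBox"], page_w, page_h)
            )
    return boxes


# Hand-written sample in the shape of a Textract "Blocks" list.
sample_blocks = [
    {"BlockType": "WORD", "Text": "Account",
     "Geometry": {"BoundingBox":
                  {"Left": 0.0, "Top": 0.5, "Width": 0.25, "Height": 0.125}}},
    {"BlockType": "WORD", "Text": "123-45-6789",
     "Geometry": {"BoundingBox":
                  {"Left": 0.25, "Top": 0.5, "Width": 0.25, "Height": 0.125}}},
]

print(redaction_boxes(sample_blocks, page_w=1000, page_h=800))
# prints [(250, 400, 500, 500)]
```

Rounding the box outward (floor for the top-left corner, ceiling for the bottom-right) errs on the side of covering slightly more than the detected text, which is the safer direction for redaction.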
Topics
- Data Redaction
- Amazon Textract
- AWS Step Functions
- AWS DataSync
- PCI DSS Compliance
- Financial Services
Best for: AI Engineer, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.