The Contract-driven Data Platform
Summary
Andrew Jones, creator of data contracts and Principal Engineer at Springer Nature, advocates for a "contract-driven data platform" to address the complexities of traditional data management. This approach, which he introduced in 2021 and detailed in his 2024 book, redefines data quality, ownership, and governance by treating data as a product with explicit, enforceable interfaces. Traditional platforms suffer from disparate datasets, inconsistent governance, and high costs for both data consumers and producers, all of which hinder data sharing. A contract-driven platform, in contrast, uses machine- and human-readable data contracts to capture metadata, which enables automation, consistency, and embedded governance. The focus shifts from point solutions to generic platform capabilities, so teams can build interoperable data products easily and cheaply, with less cognitive load and with data that can be reused in more places across the organization.
Key takeaway
For MLOps Engineers and AI Architects building data-intensive applications, adopting a contract-driven data platform can significantly reduce operational overhead and improve data reliability. By defining explicit data contracts, your teams can automate data quality, governance, and provisioning, ensuring a stable and consistent data interface for AI models and data products. This approach minimizes the cognitive load on data producers and consumers, accelerating development and deployment of trustworthy data solutions.
Key insights
Data contracts enable a "contract-driven data platform" for consistent, interoperable, and governed data products.
Principles
- Automate common data tasks.
- Ensure data consistency via standard tooling.
- Embed governance into data tooling.
Method
Define data contracts in a human- and machine-readable format (e.g., YAML, Jsonnet) with change management and versioning. Use these contracts to provision data warehouse tables and automate data quality and governance tasks.
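As a rough illustration of this method, the sketch below renders a warehouse table definition from a contract. The contract layout, the field names, the `SQL_TYPES` mapping, and the `render_create_table` helper are all assumptions made for this example, not the format or tooling described in the original article. The contract is written as a Python dictionary to keep the example self-contained; in practice it would more likely be loaded from a YAML or Jsonnet file kept under version control.

```python
# Hypothetical contract. In practice this would be loaded from a
# version-controlled YAML or Jsonnet file rather than defined inline.
CONTRACT = {
    "name": "customer_orders",
    "version": "1.2.0",
    "owner": "checkout-team",
    "fields": [
        {"name": "order_id", "type": "string", "required": True},
        {"name": "customer_email", "type": "string", "required": True, "pii": True},
        {"name": "amount", "type": "decimal", "required": True},
        {"name": "created_at", "type": "timestamp", "required": False},
    ],
}

# Assumed mapping from contract types to generic SQL column types.
SQL_TYPES = {
    "string": "VARCHAR",
    "decimal": "NUMERIC(18, 2)",
    "timestamp": "TIMESTAMP",
    "integer": "BIGINT",
}


def render_create_table(contract: dict) -> str:
    """Render a CREATE TABLE statement from the contract metadata.

    The table name carries the major version so that a breaking change
    can be provisioned next to the previous table instead of replacing it.
    """
    major = contract["version"].split(".")[0]
    table = f'{contract["name"]}_v{major}'
    columns = []
    for field in contract["fields"]:
        nullability = " NOT NULL" if field.get("required") else ""
        columns.append(f'  {field["name"]} {SQL_TYPES[field["type"]]}{nullability}')
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"


if __name__ == "__main__":
    print(render_create_table(CONTRACT))
```

Because the table is derived entirely from the contract, the same metadata can drive other platform capabilities (quality checks, access policies) without each team writing its own provisioning code.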
In practice
- Use Open Data Contract Standard (ODCS) for contract definition.
- Implement versioning and compatibility checks for data contracts.
- Automate data quality checks and PII management via contract metadata.
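The practices above can be sketched in a few functions. This is a minimal example under stated assumptions: the contract structure is a simplified, hypothetical one (it is not the ODCS schema), the compatibility rules are one plausible policy rather than a standard, and `breaking_changes`, `check_row`, and `mask_pii` are illustrative names. The truncated hash used for masking only demonstrates metadata-driven handling; it is not a vetted anonymization scheme.

```python
import hashlib

# Two hypothetical versions of the same contract (simplified, not ODCS).
V1 = {
    "version": "1.0.0",
    "fields": [
        {"name": "order_id", "type": "string", "required": True},
        {"name": "customer_email", "type": "string", "required": True, "pii": True},
        {"name": "amount", "type": "decimal", "required": True},
    ],
}
V2 = {
    "version": "2.0.0",
    "fields": [
        {"name": "order_id", "type": "string", "required": True},
        {"name": "customer_email", "type": "string", "required": True, "pii": True},
        {"name": "currency", "type": "string", "required": True},
    ],
}


def breaking_changes(old: dict, new: dict) -> list[str]:
    """List the reasons `new` is incompatible with `old` (assumed rules)."""
    old_fields = {f["name"]: f for f in old["fields"]}
    new_fields = {f["name"]: f for f in new["fields"]}
    problems = []
    for name, field in old_fields.items():
        if name not in new_fields:
            problems.append(f"field removed: {name}")
        elif new_fields[name]["type"] != field["type"]:
            problems.append(f"type changed: {name}")
    for name, field in new_fields.items():
        if name not in old_fields and field.get("required"):
            problems.append(f"new required field: {name}")
    return problems


def check_row(contract: dict, row: dict) -> list[str]:
    """Data quality check: every required field must be present and non-null."""
    return [
        f"missing required field: {f['name']}"
        for f in contract["fields"]
        if f.get("required") and row.get(f["name"]) is None
    ]


def mask_pii(contract: dict, row: dict) -> dict:
    """Return a copy of the row with fields flagged `pii` replaced by a short hash."""
    pii_fields = {f["name"] for f in contract["fields"] if f.get("pii")}
    masked = dict(row)
    for name in pii_fields:
        if masked.get(name) is not None:
            digest = hashlib.sha256(str(masked[name]).encode()).hexdigest()
            masked[name] = digest[:12]
    return masked


if __name__ == "__main__":
    print(breaking_changes(V1, V2))
    row = {"order_id": "o-1", "customer_email": "a@example.com"}
    print(check_row(V1, row))
    print(mask_pii(V1, row))
```

A check like `breaking_changes` could run in continuous integration whenever a contract file changes, so that an incompatible edit is rejected or forced into a new major version before it reaches consumers.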
Topics
- Data Contracts
- Contract-driven Data Platform
- Data Product Design
- Data Governance Automation
- Metadata Management
Best for: Data Engineer, MLOps Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by Modern Data 101.