Data Contract
Also known as: data product contract, schema contract
A data contract is a formal, versioned agreement between a data producer and its consumers about the schema, semantics, freshness, and quality of a dataset — treated as code and enforced in CI rather than tribal knowledge.
Detailed explanation
Data contracts make the implicit explicit: producers commit to a schema, allowed value ranges, freshness SLAs, and ownership; consumers can depend on those guarantees and break the build if they are violated. The contract is typically a YAML or JSON file checked into Git, validated in CI, and enforced at ingest or transformation time.
Tools in this space include open-source projects like dbt contracts, data-contract-cli, and Great Expectations, plus commercial platforms. The contract integrates with the data platform (warehouse, lakehouse, streaming) to catch breaking changes before they reach downstream pipelines, dashboards, or ML features.
Data contracts pair naturally with Data Mesh — each domain publishes contracts for the products it owns — but they are valuable in any setup where multiple teams consume the same data and silent schema drift causes incidents.