Data Breaks
Scope
This entry covers situations in which the "Source of Truth" for data handled by software cannot be uniquely determined.
Definition
A data break is a state in which the Source of Truth for information that should be consistent cannot be uniquely determined, and duplication, omission, and inconsistency coexist.
Symptoms
- Values with the same meaning exist in multiple places and do not match
- Updated information is not reflected on certain screens or in certain processes
- Required data is missing, yet processing proceeds
- Integrity violations are treated as normal operation rather than as exceptions
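The first and third symptoms can be made observable with a simple comparison. The following is a minimal sketch, assuming two in-memory stores keyed by the same identifier; the names `Discrepancy`, `find_discrepancies`, `primary`, and `copy` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Discrepancy:
    key: str
    kind: str    # "mismatch" or "missing"
    detail: str


def find_discrepancies(primary: Dict[str, str], copy: Dict[str, str]) -> List[Discrepancy]:
    """Compare two stores that are supposed to hold the same values.

    Hypothetical helper: reports keys whose values differ and keys that
    exist in only one of the two stores.
    """
    found: List[Discrepancy] = []
    for key in sorted(primary.keys() | copy.keys()):
        if key not in primary:
            found.append(Discrepancy(key, "missing", "absent from primary"))
        elif key not in copy:
            found.append(Discrepancy(key, "missing", "absent from copy"))
        elif primary[key] != copy[key]:
            found.append(
                Discrepancy(key, "mismatch", f"{primary[key]!r} != {copy[key]!r}")
            )
    return found
```

A check like this only reports that the two stores disagree. It cannot say which one is correct, and that gap is exactly what this entry describes.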
Typical Triggers
- Copies and caches introduced as temporary measures become part of regular operations unchanged
- Assumptions made during migration are only partially shared
- External integrations or manual operations rewrite data through unexpected paths
- Read and update points multiply without the location of the source of truth being made explicit
Diagnostic Questions
- Can the source of truth for this value be named immediately?
- Is data with the same meaning updated in multiple places?
- Are omission and inconsistency assumed as the norm rather than treated as exceptions?
- Can the scope of impact be enumerated when making a fix?
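Several of these questions can be asked mechanically once the answers are written down. The following is a minimal sketch, assuming a hand-maintained registry that records, for each field, its declared source of truth and the components that write and read it. All names here (`FieldRecord`, the helper functions, and the example field and component names) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class FieldRecord:
    source_of_truth: Optional[str] = None          # None means nobody has declared one
    writers: Set[str] = field(default_factory=set)
    readers: Set[str] = field(default_factory=set)


def fields_without_source(registry: Dict[str, FieldRecord]) -> List[str]:
    """Fields for which the source of truth cannot be named."""
    return sorted(n for n, r in registry.items() if r.source_of_truth is None)


def fields_with_multiple_writers(registry: Dict[str, FieldRecord]) -> List[str]:
    """Fields that are updated from more than one place."""
    return sorted(n for n, r in registry.items() if len(r.writers) > 1)


def impact_of_change(registry: Dict[str, FieldRecord], name: str) -> List[str]:
    """Every component that touches the field, as a first cut at the scope of a fix."""
    record = registry[name]
    return sorted(record.writers | record.readers)


# Example data, invented for illustration only.
registry = {
    "customer.email": FieldRecord("crm_db", {"crm_service"}, {"billing_ui", "mailer"}),
    "order.status": FieldRecord(None, {"order_service", "nightly_batch"}, {"admin_ui"}),
}
```

With this example data, `order.status` is flagged by both `fields_without_source` and `fields_with_multiple_writers`. The registry is only as reliable as its upkeep, so an empty result does not prove that no data break exists.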
What This Is Not
- This does not discuss whether normalization or denormalization is appropriate
- This is not about choosing a specific DB technology or data store
- This does not refer to one-off input errors or accidental failures
Connections
- Why It Breaks: Context Erosion, Measurement Gap