
Data Breaks

Scope

This entry covers situations in which the "Source of Truth" for data handled by software cannot be uniquely determined.

Definition

A data break is a state in which information that should be consistent has no uniquely identifiable Source of Truth, and in which duplication, omission, and inconsistency coexist.

Symptoms

  • Values with the same meaning exist in multiple places and do not match
  • Updated information is not reflected in specific screens or processing
  • Required data is missing, yet processing proceeds
  • Integrity violations are treated not as exceptions but as normal operations
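
The first and third symptoms can be checked mechanically. The following is a minimal Python sketch, not a definitive implementation. It assumes that two hypothetical stores holding the same logical value are available as key-value mappings (`store_a`, `store_b`); the names `Finding` and `compare_copies` are illustrative. It only reports where the copies disagree or where one lacks a key. It deliberately does not decide which side is correct, because in a data break that is exactly what cannot be answered.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    key: str
    kind: str    # "mismatch" or "missing"
    detail: str


def compare_copies(store_a: dict[str, object], store_b: dict[str, object]) -> list[Finding]:
    """Report keys whose values differ or that exist in only one store."""
    findings: list[Finding] = []
    # Walk the union of keys so that omissions on either side are visible.
    for key in sorted(store_a.keys() | store_b.keys()):
        if key not in store_a:
            findings.append(Finding(key, "missing", "absent from store_a"))
        elif key not in store_b:
            findings.append(Finding(key, "missing", "absent from store_b"))
        elif store_a[key] != store_b[key]:
            detail = f"store_a={store_a[key]!r}, store_b={store_b[key]!r}"
            findings.append(Finding(key, "mismatch", detail))
    return findings


# Hypothetical example data: an accounts table and a copy kept elsewhere.
accounts = {"u1": "a@example.com", "u2": "b@example.com"}
crm_copy = {"u1": "a@example.com", "u2": "old@example.com", "u3": "c@example.com"}

for finding in compare_copies(accounts, crm_copy):
    print(finding.kind, finding.key, finding.detail)
```

A non-empty result is only evidence of the symptoms. Whether it indicates a data break depends on whether anyone can say which store is authoritative.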

Typical Triggers

  • Copies and caches introduced as temporary measures become part of regular operations unchanged
  • Assumptions made during migration are only partially shared
  • External integrations or manual operations rewrite data through unexpected paths
  • Reference and update points multiply without the location of the source of truth being made explicit

Diagnostic Questions

  • Can the source of truth for this value be named immediately?
  • Is data with the same meaning updated in multiple places?
  • Are omission and inconsistency assumed as the norm rather than handled as "exceptions"?
  • Can the scope of impact be enumerated when making a fix?
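
One way to make the first, second, and fourth questions answerable is to record, for each value, its declared source of truth and the components that read or write it. The sketch below is a hypothetical in-memory registry in Python. `OwnershipRegistry`, `FieldRecord`, and the component names are assumptions made for illustration, not part of any existing tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FieldRecord:
    source_of_truth: Optional[str] = None
    writers: set[str] = field(default_factory=set)
    readers: set[str] = field(default_factory=set)


class OwnershipRegistry:
    """Tracks who owns, writes, and reads each logical value."""

    def __init__(self) -> None:
        self._fields: dict[str, FieldRecord] = {}

    def _record(self, name: str) -> FieldRecord:
        return self._fields.setdefault(name, FieldRecord())

    def declare_source(self, name: str, component: str) -> None:
        self._record(name).source_of_truth = component

    def add_writer(self, name: str, component: str) -> None:
        self._record(name).writers.add(component)

    def add_reader(self, name: str, component: str) -> None:
        self._record(name).readers.add(component)

    def diagnose(self, name: str) -> list[str]:
        """Answer questions 1 and 2 for a single value."""
        rec = self._fields.get(name, FieldRecord())
        problems: list[str] = []
        if rec.source_of_truth is None:
            problems.append("no declared source of truth")
        if len(rec.writers) > 1:
            problems.append("updated in multiple places: " + ", ".join(sorted(rec.writers)))
        return problems

    def impact(self, name: str) -> list[str]:
        """Answer question 4: every component that touches the value."""
        rec = self._fields.get(name, FieldRecord())
        return sorted(rec.readers | rec.writers)


# Hypothetical usage
registry = OwnershipRegistry()
registry.declare_source("customer.email", "accounts_service")
registry.add_writer("customer.email", "accounts_service")
registry.add_writer("customer.email", "crm_sync_job")
registry.add_reader("customer.email", "billing_screen")
print(registry.diagnose("customer.email"))
print(registry.impact("customer.email"))
```

The registry is only as accurate as its entries. Its purpose here is to show what a concrete answer to each question would look like.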

What This Is Not

  • This does not discuss the appropriateness of normalization or denormalization
  • This is not a problem of choosing specific DB technologies or data stores
  • This does not refer to one-off input errors or accidental failures

Connections