What Breaks
Software breaks in many different ways.
In the field, these breakdowns are usually handled as individual bugs or incidents,
and the ways of breaking themselves are rarely articulated explicitly.
This chapter takes "how software breaks" as its starting point
and organizes the kinds of breakdown that are currently observed.
The purpose here is not to identify causes or solutions,
but to provide a common language
for pointing to those breakdowns.
Classification of Ways of Breaking
- Data Breaks
  A state where the source of truth is not uniquely determined, duplication and omission are observed,
  and processing that assumes consistency no longer holds.
- State Breaks
  A state where state transitions are missing, flags proliferate,
  and the screen diverges from the internal state.
- Time Breaks
  A state where inconsistencies such as double execution are observed
  due to concurrent execution, order reversal, and interference between retries and side effects.
- Boundary Breaks
  A state where boundary assumptions have collapsed
  due to external API changes, timeouts, contaminated input, and similar events.
- Responsibility Breaks
  A state where it is unclear who makes a decision, decisions are scattered,
  and reasons for change are mixed together.
- Operation Breaks
  A state where deployment and rollback become difficult,
  and accidents arise from lack of observability or from configuration.
Each page primarily addresses the broken state
and symptoms observed in the field.
Cause analysis and recovery methods are not addressed in this chapter.
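As one concrete illustration of such a symptom, the following is a minimal sketch of a Time Break in which a retry interferes with a side effect. The names `FlakyPaymentGateway` and `charge_with_retry` are hypothetical and exist only for this example. The sketch assumes the first call applies its side effect but loses its response, so the caller retries.

```python
class FlakyPaymentGateway:
    """Hypothetical gateway: applies the charge, then loses the first response."""

    def __init__(self):
        self.charges = []              # the observable side effects
        self._lose_next_reply = True

    def charge(self, order_id, amount):
        self.charges.append((order_id, amount))   # side effect happens first
        if self._lose_next_reply:
            self._lose_next_reply = False
            raise TimeoutError("response lost after the charge was applied")
        return "ok"


def charge_with_retry(gateway, order_id, amount, attempts=3):
    """Naive retry: repeats the call without knowing whether it took effect."""
    for _ in range(attempts):
        try:
            return gateway.charge(order_id, amount)
        except TimeoutError:
            continue
    raise RuntimeError("all attempts failed")


gateway = FlakyPaymentGateway()
result = charge_with_retry(gateway, "order-1", 100)
print(result, len(gateway.charges))   # the call reports success, yet two charges exist
```

From the caller's point of view the operation succeeded once, while the recorded side effects show it executed twice. That gap is the observable state this chapter calls a Time Break.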
Summary of Ways of Breaking (Reference Map)
The following table maps the six ways of breaking addressed in this chapter
to the "implicit assumption" each depends on.
The assumptions shown here are not meant for determining causes;
they serve as reference points for organizing and pointing to broken states.
| Way of Breaking | Assumption That Has Collapsed |
|---|---|
| Data Breaks | The source of truth is uniquely determined |
| State Breaks | State and transitions can be made explicit |
| Time Breaks | Execution order and count can be controlled |
| Boundary Breaks | Boundary assumptions are protected |
| Responsibility Breaks | Decisions and responsibility can be tracked |
| Operation Breaks | Operations can be reproduced and controlled |
These ways of breaking are not mutually independent:
the collapse of one assumption often induces other ways of breaking.
This chapter concentrates first on asking which assumptions have currently collapsed
and on organizing which ways of breaking are observed.
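The reference map can also be written down as a small lookup structure. The sketch below is one possible encoding, not a prescribed one. The names `WayOfBreaking`, `COLLAPSED_ASSUMPTION`, and `describe` are hypothetical. Because the ways of breaking are not independent, it assumes an observed incident may carry more than one tag.

```python
from enum import Enum


class WayOfBreaking(Enum):
    """The six ways of breaking, in the order used by the reference map."""
    DATA = "Data Breaks"
    STATE = "State Breaks"
    TIME = "Time Breaks"
    BOUNDARY = "Boundary Breaks"
    RESPONSIBILITY = "Responsibility Breaks"
    OPERATION = "Operation Breaks"


# Each way of breaking mapped to the assumption that has collapsed.
COLLAPSED_ASSUMPTION = {
    WayOfBreaking.DATA: "The source of truth is uniquely determined",
    WayOfBreaking.STATE: "State and transitions can be made explicit",
    WayOfBreaking.TIME: "Execution order and count can be controlled",
    WayOfBreaking.BOUNDARY: "Boundary assumptions are protected",
    WayOfBreaking.RESPONSIBILITY: "Decisions and responsibility can be tracked",
    WayOfBreaking.OPERATION: "Operations can be reproduced and controlled",
}


def describe(tags):
    """Return one 'way: assumption' line per tag, in the map's order."""
    return [
        f"{way.value}: {COLLAPSED_ASSUMPTION[way]}"
        for way in WayOfBreaking
        if way in tags
    ]


# Hypothetical incident that shows two ways of breaking at once.
incident_tags = {WayOfBreaking.TIME, WayOfBreaking.DATA}
for line in describe(incident_tags):
    print(line)
```

The lookup only names which assumptions have collapsed for a given observation. In keeping with the scope of this chapter, it says nothing about why they collapsed.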
Notes
This chapter concentrates on observing and classifying broken states.
Questions of why they occur and how they are repaired
are not addressed in this chapter.
Causes and background are addressed in the Why It Breaks chapter,
concrete structures of failure in the Failure Patterns chapter,
and ways of thinking for restoring decisions under uncertain assumptions
in the Restoring Decision-Making chapter.