Operation Breaks
Scope
The target is states in which operational assumptions for continuously running software can no longer be maintained. What is addressed here is not problems of specific operational tools or environments, but types of breakdown observed during operations.
Definition
Operations breaks describes a state in which, regarding operational acts such as deployment, rollback, observation, and configuration changes, assumptions that they can be executed safely and reproducibly are partially collapsed, and situations are observed where changes and failure response cannot be controlled as assumed.
Symptoms
- Deployment is treated as a one-time operation and cannot be reproduced
- What is happening cannot be immediately grasped when failures occur
- Assumptions hold where rollback exists only in theory and cannot be executed in actual operations
- Situations are observed where configuration changes do not remain as code or history, and behavior is environment-dependent
Typical Triggers
- States continue where operational procedures are treated as implicit knowledge and documentation or sharing is not done
- Observation and logs are added after the fact and are not treated as operational assumptions
- Emergency response and exception handling become permanent and are incorporated as assumptions of normal flow
- Changes are layered while the boundary between configuration and code remains unclear
Diagnostic Questions
- Is it a state where the currently running state can be explained in a reproducible form?
- Is it a state where necessary information can be immediately obtained during failures?
- Is it a state where it can be confirmed that rollback is actually executable?
- Is it a state where history and scope of impact of configuration changes can be tracked?
What This Is Not
- This is not a problem of selecting specific operational tools or cloud services
- This does not refer to lack of attention or effort by operations personnel
- This does not refer to single operational errors
Connections
- Why It Breaks: Context Erosion, Measurement Gap