The Crisis Engineer: Resilience Is Not a Feature — It Is a Design Obligation
The Crisis Engineer argues that resilience can no longer be treated as an optional technical feature. It must be designed, governed, tested, and embedded into the way organisations operate. Published jointly by The Swiss Quality Consulting GmbH and Claruna Business Consulting GmbH, and authored by Mahmoud Hammoud and Bärbel Wetenkamp, the book positions crisis engineering as a serious organisational discipline rather than a reactive response to failure.
The book moves beyond the traditional view of cybersecurity as a specialised technical function managed only by IT or security teams. Instead, it shows that digital crises now sit at the intersection of architecture, governance, business continuity, human judgement, and institutional responsibility. In this context, the Crisis Engineer is not presented as a heroic individual arriving after disaster strikes. The Crisis Engineer is a disciplined function built into the system before pressure arrives.
From Technical Incidents to Systemic Collapse
One of the strongest contributions of The Crisis Engineer is its reframing of digital failure. In the past, organisations often treated outages as isolated technical events: a server went down, a software update failed, or a service became temporarily unavailable. That understanding no longer fits the way modern organisations operate. Today, businesses depend on complex networks of cloud platforms, SaaS tools, identity services, APIs, analytics providers, and third-party integrations. When one part fails, the impact can quickly travel across many connected systems.
The book uses the concept of blast radius to explain this reality. Rather than focusing only on the original cause of failure, blast radius examines how far disruption spreads after the first incident begins. This shift is important because the damage caused by a crisis is often shaped less by the initial fault and more by the structure through which that fault travels. A small technical issue can become a major operational crisis if the system is tightly connected, poorly governed, or difficult to contain.
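As a rough illustration of the idea (not taken from the book), blast radius can be sketched as reachability over a dependency map: start at the failed component and follow every "is depended on by" link. The service names and the `DEPENDENTS` map below are hypothetical, and real propagation also depends on timeouts, retries, and containment controls that this sketch ignores.

```python
from collections import deque

def blast_radius(dependents, failed):
    """Return every service reachable from `failed` by following
    'is depended on by' edges, using a breadth-first search."""
    affected = set()
    queue = deque([failed])
    while queue:
        service = queue.popleft()
        for downstream in dependents.get(service, ()):
            if downstream != failed and downstream not in affected:
                affected.add(downstream)
                queue.append(downstream)
    return affected

# Hypothetical map: if the key fails, the listed services are impacted.
DEPENDENTS = {
    "identity-provider": ["crm", "billing-api"],
    "billing-api": ["invoicing", "customer-portal"],
    "crm": ["customer-portal"],
    "analytics": [],
}

if __name__ == "__main__":
    print(sorted(blast_radius(DEPENDENTS, "identity-provider")))
    print(sorted(blast_radius(DEPENDENTS, "analytics")))
```

In this toy map, a fault in the identity provider reaches four other services while a fault in analytics reaches none, which is the point of the paragraph above: the structure, not the fault, determines how far the damage travels.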
Delegated Trust and the Hidden Expansion of Risk
The Crisis Engineer also examines how modern organisations extend trust beyond their own walls. Tools such as OAuth, SaaS integrations, API connections, and external platforms allow businesses to move faster and operate more efficiently. However, they also create new forms of dependency and exposure. The book explains that delegated trust is not only a technical decision. It is also a governance decision.
Once access is granted through tokens, integrations, or connected services, the organisation must be able to monitor, control, and revoke that access effectively. Multi-factor authentication may protect the login moment, but it does not automatically manage everything that happens after access has been authorised. This distinction is central to the book’s argument. Security does not fail only when a protocol breaks. It can also fail when legitimate trust pathways are poorly understood, weakly governed, or allowed to expand without sufficient visibility.
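One way to picture what governing delegated trust could involve is a periodic review of granted access. The sketch below is an assumption-laden illustration, not a method described in the book: the `Grant` record, the scope names, and the 90-day idle threshold are all hypothetical. It flags grants that have gone unused for too long or that carry sensitive scopes, so that someone can decide whether to revoke them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass(frozen=True)
class Grant:
    """A hypothetical record of access delegated to a third-party app."""
    app: str
    scopes: frozenset
    last_used: datetime

# Hypothetical scope names treated as high-risk in this sketch.
SENSITIVE_SCOPES = frozenset({"mail.read", "files.write"})

def grants_to_review(grants, now, max_idle=timedelta(days=90)):
    """Return grants that are idle beyond `max_idle` or hold a sensitive scope."""
    flagged = []
    for grant in grants:
        idle_too_long = now - grant.last_used > max_idle
        has_sensitive_scope = bool(grant.scopes & SENSITIVE_SCOPES)
        if idle_too_long or has_sensitive_scope:
            flagged.append(grant)
    return flagged

if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    grants = [
        Grant("calendar-sync", frozenset({"calendar.read"}), datetime(2023, 12, 20)),
        Grant("old-export", frozenset({"files.read"}), datetime(2023, 6, 1)),
        Grant("mail-plugin", frozenset({"mail.read"}), datetime(2023, 12, 30)),
    ]
    for grant in grants_to_review(grants, now):
        print(grant.app)
```

Nothing here touches the login step. The review operates entirely on access that was already legitimately authorised, which is the gap the paragraph above describes.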
Degraded Mode and the Return of Human Responsibility
One of the most practical sections of the book focuses on what actually happens when systems become unavailable. Organisations do not simply stop working during a crisis. They enter degraded mode. Automated processes are replaced by manual workarounds. Staff must make decisions under pressure. Teams must coordinate without the tools they usually depend on. In these moments, resilience becomes a human and organisational test, not only a technical one.
The book’s discussion of the 2017 WannaCry attack and its impact on NHS trusts is especially powerful. Mahmoud Hammoud’s first-hand reflection shows that containment was not achieved through automated systems alone. It required people to act quickly, disconnect machines, isolate networks, and make difficult decisions with incomplete information. Bärbel Wetenkamp’s reflection adds another important layer: when systems disappear, direct human responsibility becomes visible again.
Why Fragility Is Often Designed In
The Crisis Engineer makes a strong case that many failures are not accidental surprises. They are the predictable result of systems designed mainly for efficiency, speed, and integration, but not for stress, uncertainty, or recovery. The book explores concepts such as gray failure, where systems appear to be functioning while degradation is already spreading beneath the surface. These are dangerous conditions because traditional health checks may not detect the problem early enough.
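A small sketch can make the gray failure problem concrete. It is an illustration under stated assumptions rather than anything prescribed by the book: the function names, the baseline of 120 ms, and the 1.5x tolerance are hypothetical. A binary health check that only looks at status codes reports the service as healthy, while a comparison of tail latency against a baseline shows that it is already degrading.

```python
from statistics import quantiles

def binary_health(status_codes):
    """Traditional check: healthy if every probe returned HTTP 200."""
    return all(code == 200 for code in status_codes)

def latency_degraded(latencies_ms, baseline_p95_ms, tolerance=1.5):
    """Flag degradation when the observed 95th-percentile latency
    exceeds the baseline by more than the given tolerance factor."""
    # quantiles(..., n=20) returns 19 cut points; the last is the 95th percentile.
    p95 = quantiles(latencies_ms, n=20)[-1]
    return p95 > baseline_p95_ms * tolerance

if __name__ == "__main__":
    # Hypothetical probe results: every request succeeds, but two are very slow.
    status_codes = [200] * 20
    latencies_ms = [100] * 18 + [900, 950]
    print("binary check healthy:", binary_health(status_codes))
    print("latency degraded:", latency_degraded(latencies_ms, baseline_p95_ms=120))
```

Both signals come from the same twenty probes. The difference lies only in what the check is designed to look at, which is why the book treats such blind spots as a design matter.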
Identity and Access Management is another major example. Over time, large organisations accumulate roles, permissions, accounts, and policies. Under normal conditions, this complexity may seem manageable. During a crisis, it can become a serious obstacle. Response teams may struggle to know which access paths are safe, which are critical, and which must be shut down. This creates a difficult choice: revoke too much access and risk stopping essential services, or leave access open and risk further propagation.
Business Continuity Must Be Alive
The book is also clear that business continuity cannot remain a static document stored for audit purposes. Continuity planning only matters if it works under real pressure. A Business Impact Analysis should not be treated as a one-time formality. It should be a living process that is tested, updated, and connected to the realities of operations.
The Crisis Engineer shows that cyber incidents quickly become business incidents when digital infrastructure is tied to revenue, supply chains, customer service, finance, and regulatory obligations. A ransomware attack does not only affect data or systems. It can interrupt sales, damage trust, delay payments, create legal exposure, and threaten organisational survival. This is why crisis engineering must involve business leaders, not only technical teams.
Decision-Making Under Pressure
Another important strength of the book is its attention to human judgement. During a crisis, people are expected to make high-stakes decisions under time pressure, uncertainty, fatigue, and emotional strain. The book explains that decision quality naturally declines when cognitive load increases. This is not a personal weakness. It is a predictable human response to prolonged stress.
For this reason, structure becomes a form of protection. Clear escalation paths, rehearsed playbooks, defined authority, and trusted communication channels help preserve decision quality when pressure rises. Leadership during crisis is not only about motivation or confidence. It is about maintaining reliability when individual and organisational capacity are being stretched.
The Human Firewall Must Be Designed, Not Assumed
The Crisis Engineer also challenges the common belief that awareness training alone can solve human security risk. People do not behave perfectly under stress, distraction, overload, or uncertainty. The book argues that organisations must design systems around real human behaviour, not ideal human behaviour.
This means reducing unnecessary complexity, making safe actions easier, building helpful defaults, and designing workflows that account for error. If an organisation depends on people never being tired, confused, rushed, or distracted, then the system itself is fragile. A strong human firewall is not created by blame. It is created by thoughtful design, realistic governance, and environments that help people make better decisions.
Crisis Engineering as a Discipline
The closing argument of The Crisis Engineer is that resilience is not something organisations can buy, install, or claim through policy language. It is a living capability. It depends on the ability to respond, monitor, anticipate, and learn. It requires technical architecture, business continuity, governance, leadership, and human behaviour to work together.
The book does not offer easy reassurance. It offers discipline. It asks organisations to stop treating recovery as improvisation and start treating it as a designed capability. The Crisis Engineer is therefore not a symbolic role. It is a necessary function for any organisation that depends on systems that must remain understandable, governable, and recoverable when normal operations collapse.
Who This Book Is For
The Crisis Engineer is essential reading for CISOs, technology leaders, enterprise architects, business continuity professionals, risk officers, governance specialists, and executives responsible for systems that must continue operating under pressure. It is also valuable for anyone who wants to understand why modern digital crises cannot be solved through technical fixes alone.
At its core, the book argues that resilience is a responsibility. It must be designed before the crisis, tested before the failure, and governed before the pressure arrives. For organisations operating in a world of constant digital dependency, this is no longer optional. It is a design obligation.