At a district heating plant in Europe, a gas turbine destroyed itself in a flash. A severe surge – itself likely the consequence of a delayed blow-off valve opening – broke compressor blades, and the debris took out everything it could reach downstream: turbine section, burners, exhaust. The pressure wave even damaged the air intake filter house.

It was the third severe failure involving this unit. Not literally the same machine each time – earlier repairs and overhauls had been done by replacement – but the same construction, the same type.

When the dust settles, the operator faces a decision: repair it, replace it, or shut it down for good.

Before any path is chosen

The most expensive mistakes tend to happen in the first days, under pressure to "get moving." At least four things must be done first, in roughly this order:

  • Preserve the evidence. The failure scene is the basis of every later claim – toward the OEM, the service contractor, and the insurer. Don't dismantle, clean, or ship anything until the condition is documented and the relevant parties have inspected or formally waived inspection.

  • Notify the insurer immediately and arrange their inspection early. Machinery breakdown and business interruption coverage have notification clauses, and the insurer's findings will shape the recoverable amount. This conversation has a second dimension: after a severe failure, premiums, deductibles, and exclusions may change. After repeated failures, insurability itself is at stake.

  • Commission a thorough and objective root cause analysis (RCA). The OEM will investigate, but it is investigating its own machine and its own potential liability. For a severe event, the owner needs either independent expertise or enough in-house depth to challenge the findings objectively.

  • Stabilise the commercial position. Inform the trader and dispatch organisation with a defensible availability forecast, activate backup sources, and review supply obligations and penalty exposure. A district heating plant's commercial bleeding starts on day one and runs through every path.

Make sure you understand what the service contract covers

The first hard reality is contractual. What does the service contract cover in a catastrophic event – and what does it exclude?

Expect the OEM's position to be narrow. The classic argument: the "real" damage was the first blade that let go; everything downstream is consequential – excluded or capped under the limitation-of-liability clauses. Under that reading, the contract pays for one blade, and the owner pays for a wrecked engine, an exhaust system, a filter house, and the downtime – even after insurance.

The starting number is therefore the gap between total damage and the recoverable amount – from contract and insurance combined, after caps, deductibles, exclusions, and the inevitable negotiation. An owner who has kept disciplined operating records, maintenance documentation, and event data negotiates this gap from strength. One who hasn't, doesn't.
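The gap arithmetic can be sketched in a few lines. This is a minimal illustration, not a claims model: the `Recovery` class, the `owner_gap` function, and every figure are hypothetical placeholders, and it assumes each recovery source applies its deductible first and its cap second, with no overlap between contract and insurance payouts.

```python
from dataclasses import dataclass


@dataclass
class Recovery:
    """One source of recovery (service contract or insurance policy)."""
    name: str
    covered_damage: float  # damage accepted as covered, after exclusions
    deductible: float      # amount the owner carries first
    cap: float             # maximum payout under the clause or policy

    def payout(self) -> float:
        # Deductible comes off first, then the cap limits what is left.
        return max(0.0, min(self.covered_damage - self.deductible, self.cap))


def owner_gap(total_damage: float, recoveries: list[Recovery]) -> float:
    """Damage the owner still carries after all recoveries."""
    recovered = sum(r.payout() for r in recoveries)
    return max(0.0, total_damage - recovered)


# Placeholder figures in an arbitrary currency unit, not from a real case.
total_damage = 100.0
recoveries = [
    # Narrow OEM reading: only the first failed blade counts as covered.
    Recovery("service contract", covered_damage=5.0, deductible=0.0, cap=10.0),
    Recovery("machinery breakdown insurance", covered_damage=80.0,
             deductible=10.0, cap=60.0),
]
gap = owner_gap(total_damage, recoveries)
print(f"Owner's gap: {gap:.1f}")
```

Running the same calculation with the OEM's narrow reading and with the owner's own reading of the contract shows how much of the gap is actually open to negotiation.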

If previous events have already tested the agreement's performance guarantee provisions, the payment history under those clauses is also a data point – not just background context.

Root cause does matter

Was this a random event, or a systematic one?

A genuinely random failure – foreign object, isolated material defect – leaves the unit's risk profile roughly intact after proper repair. A systematic cause – a control logic weakness, an auxiliary system that responds too slowly, a design characteristic of the type, a recurring material defect, an operating regime the machine was never meant for – means the repaired unit inherits the same exposure.

Three severe failures in the same family is not bad luck; it is data. The events may not share the exact same proximate cause, but the failure history signals something about the reliability of this design. That judgement drives the reliability forecast for the repair path – and whether "replace like-for-like" is even a rational option.
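One way to put a rough number on "not bad luck" is to ask how likely three or more severe failures would be if they were purely random. The sketch below assumes random failures follow a Poisson process, which is a modelling choice and not something the case itself establishes. The base rate and the exposure are hypothetical placeholders, to be replaced with reference data for the unit class.

```python
import math


def prob_at_least(k: int, expected_events: float) -> float:
    """P(N >= k) for a Poisson-distributed count with the given mean."""
    below_k = sum(
        math.exp(-expected_events) * expected_events**i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below_k


# Hypothetical inputs: an assumed base rate for random severe failures
# and the accumulated operating exposure of the unit family at the plant.
base_rate_per_unit_year = 0.02   # assumption, not reference data
exposure_unit_years = 25.0       # assumption
expected = base_rate_per_unit_year * exposure_unit_years  # 0.5 events

p_three_or_more = prob_at_least(3, expected)
print(f"P(>= 3 random severe failures) = {p_three_or_more:.3f}")
```

If that probability comes out small under any defensible base rate, the "random event" explanation carries little weight and the systematic-cause hypothesis deserves the RCA's full attention.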

The three paths – and where each one's estimates deceive

Before evaluating the three paths, the asset's role in the portfolio must be assessed. A unit carrying a district heating obligation, a reserve market commitment, or a security-of-supply role is not evaluated the same way as a marginal peaking asset.

Path 1: Repair

The repair quote is a floor, not a ceiling. Severe-damage repairs are properly scoped only after stripping – and workshop findings expand scope. Add the owner's internal costs: engineering hours, site supervision, logistics, management attention – routinely left out, but never small.

Then the harder questions. What is the realistic availability and reliability of the repaired unit, given the failure history? Where does the unit sit in its lifecycle – remaining design life, the cost and timing of the next major inspections, parts and support obsolescence for an aging type, emissions and regulatory fit over the planning horizon?

A repaired unit is not a new unit. It is a mixed-age assembly with a history – and the insurer and the dispatch organisation will both treat it that way.

Use actual reference data for the unit class and type when assessing repaired lifetime and performance expectations. Talk to peer operators, gather data from operator forums, and build an objective picture – rather than relying on the OEM's assessment alone.

Path 2: Replace

Realistic downtime for a complete unit replacement in an operating plant is 18 to 30 months, covering project planning, procurement, engineering, manufacturing, factory acceptance testing (FAT), installation, integration, commissioning, and site acceptance testing (SAT). And it is an integration project, not an equipment purchase: controls and DCS interfaces, fuel systems, electrical connection, heat recovery, civil works, permitting. The schedule must be priced as such.

It also assumes organisational capability that may need to be rebuilt. A project of this complexity – engineering, procurement, contracts, commissioning, stakeholder management over several years – requires people and processes. An organisation that has not executed a similar project recently should include the cost of that capability in the business case.

The bridge period must be priced too. In a district heating plant, bridge operation during an 18–30 month replacement is not just lost revenue – it may mean purchasing heat from backup sources, running less efficient boilers, or activating emergency supply agreements. Each has a specific cost that must be priced into the replacement comparison from the start, not treated as a scheduling inconvenience.

In our case, the bridge meant running the other units at high load, all the time. That had severe consequences for the heat recovery hot water boilers: heavier maintenance and renewal upgrades after two heating seasons.
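Pricing the bridge can start as simply as the sketch below. The cost categories follow the text above, while the `bridge_cost` function and the monthly figures are hypothetical placeholders. It assumes a flat monthly cost, which is a simplification: a real estimate would distinguish heating season from summer months.

```python
def bridge_cost(months: int, monthly_costs: dict[str, float]) -> float:
    """Total bridge cost over the replacement period (flat monthly cost)."""
    return months * sum(monthly_costs.values())


# Placeholder monthly figures in an arbitrary currency unit.
monthly_costs = {
    "purchased_heat": 3.0,
    "less_efficient_boilers": 2.0,
    "extra_maintenance_on_remaining_units": 1.0,
}

# Price both ends of the 18 to 30 month replacement window.
low = bridge_cost(18, monthly_costs)
high = bridge_cost(30, monthly_costs)
print(f"Bridge cost range: {low:.0f} to {high:.0f}")
```

Carrying both ends of the range into the business case keeps schedule risk visible as a cost instead of a footnote.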

Replacement opens a door repair cannot: right-sizing. The original unit was selected for an operating profile that may no longer exist. A different machine – even a nominally smaller one – can be the better asset if its minimum load, efficiency curve, start-up behaviour, and maintenance economics fit how the plant actually runs today. This is where the failure, painful as it is, becomes an asset-strategy opportunity.

Against that: capital cost, the annual cost structure of the new unit, and the fleet question. A new type in a small fleet means new spares, new training, new tooling, and a knowledge curve. That cost is real and appears in no vendor offer.

A new unit will need a new service agreement. Negotiate and sign it at the same time as the EPC contract – the operator's bargaining position weakens significantly once the unit type is fixed and the EPC is awarded. The best commercial conditions are achievable only when both are negotiated together.

The old unit's service agreement also needs to be closed out. If the catastrophic failure is a contractual trigger for termination, that is the simpler path. If not, prepare for a negotiation.

Path 3: Decommission and dismantle

Sometimes this is the honest answer, if the market supports it. Price the decommissioning and dismantling, the permanent revenue loss, and above all the supply obligation: can the remaining units and backup sources carry the heat commitment, at what operating cost and penalty exposure? Then price the market position – an operator that cannot deliver when called is remembered by the trader and the dispatcher, and that memory has a price at the next contract negotiation.

The comparison that decides it

| Criterion | Repair | Replace | Decommission |
|---|---|---|---|
| Typically right when... | RCA confirms a random cause, lifecycle position is healthy, unit fits the operating regime | Cause is systematic, the type is aged, or the operating profile has changed | Supply is coverable elsewhere and the business case is gone |
| Direct cost | Repair quote – a floor; scope grows after stripping; includes removal, transport, workshop, reassembly, re-commissioning | Full CAPEX of an integration project – includes old unit removal, civil interface, brownfield EPC, integration | Dismantling cost + permanent revenue loss |
| Downtime | Months, scope- and OEM-dependent | Realistically 18–30 months | Permanent |
| Hidden costs | Owner's internal costs; OEM repricing maintenance risk after the event | Bridge operation; organisational capability; fleet integration – spares, training, tooling, knowledge curve | Penalty exposure; weakened position at next contract negotiation |
| Reliability outlook | Inherits the failure history; systematic cause = same or increased exposure | Reset – backed by new guarantees | Risk shifts to the remaining fleet's ability to deliver |
| Performance | At best restored to aged-design level | Right-sizing opportunity – match the unit to today's operating profile | Capacity lost |
| Lifecycle position | Unchanged: aging type, approaching majors, obsolescence, tightening regulation | New lifecycle; modern emissions and regulatory fit | Closed |
| Strategic flexibility | Preserves existing options but retains historical constraints | Creates new options – operating modes, market participation, performance | Removes future optionality |
| Procurement | Simpler, under existing service agreement | Heavy – LTSA to be negotiated and signed simultaneously with EPC | N/A |
| Guarantees | According to existing service agreement | Strongest available under a new service agreement | N/A |
| Insurance | Repeated-loss unit: premiums, deductibles, insurability in question | Clean position on a new asset – fee might even reduce | Claim settlement only |

Across all three paths, the deciding figure is not the repair quote versus replacement CAPEX, but the risk-weighted net value over the relevant planning horizon: revenues, total cost of ownership, recoverables, exposure, and strategic flexibility. The horizon matters because identical damage can justify different decisions depending on whether the operator needs the plant for five years or fifteen; likewise, a peaker, a baseload heat supplier, a redundant unit, or a unit carrying a supply obligation all require different weighting.
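A minimal sketch of such a risk-weighted comparison follows. The `Path` fields, the `risk_weighted_value` function, the discount rate, and all figures are hypothetical placeholders. The model is deliberately crude: expected failure losses are deducted as a flat annual expectation, and downtime simply removes availability at the start of the horizon.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    upfront_cost: float          # net of recoverables; may include bridge cost
    annual_net_revenue: float    # revenue minus operating cost
    annual_failure_prob: float   # chance of another severe failure per year
    failure_cost: float          # owner's loss if that failure occurs
    downtime_years: float        # years without revenue at the start


def risk_weighted_value(path: Path, horizon_years: int,
                        discount_rate: float) -> float:
    """Discounted net value with expected failure losses deducted each year."""
    value = -path.upfront_cost
    expected_loss = path.annual_failure_prob * path.failure_cost
    for year in range(1, horizon_years + 1):
        # Fraction of this year in which the unit is actually available.
        available = min(1.0, max(0.0, year - path.downtime_years))
        cash = available * (path.annual_net_revenue - expected_loss)
        value += cash / (1.0 + discount_rate) ** year
    return value


# Placeholder inputs in an arbitrary currency unit, not from a real case.
paths = [
    Path("Repair", 20.0, 10.0, 0.10, 50.0, 0.5),
    Path("Replace", 60.0, 12.0, 0.02, 50.0, 2.0),
    Path("Decommission", 5.0, 0.0, 0.0, 0.0, 0.0),
]

for horizon in (5, 15):
    ranked = sorted(paths, key=lambda p: risk_weighted_value(p, horizon, 0.05),
                    reverse=True)
    print(horizon, "years:", [p.name for p in ranked])
```

With these placeholder inputs, the preferred path changes between the five-year and the fifteen-year horizon, which is the point of fixing the horizon before comparing quotes.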

The decision should be made through a structured, documented evaluation with criteria defined before pricing: failure history in the probability column, contract and insurance positions in the cost columns, and the RCA verdict behind the reliability forecast. Before approval, two safeguards are worth insisting on: have someone competent argue against the preferred option, and run a pre-mortem – assume it is two years later and the decision has failed, then ask whether the causes are already visible today.

A catastrophic failure feels like a repair problem. It is an asset decision. Treat it as one.


This article draws on direct project experience. For a conversation about your specific situation, get in touch.