Failure Probability vs Failure Frequency

Probability of Failure (PoF) applies to an individual equipment unit and is a function of time. Failure frequency applies to a population of generic units and is averaged over a time span. Which level of consideration should we use? It depends on the context and, mainly, on our goals. Let’s work through the question, from rocket science down to engineering practice.

rocket launch

There were 85 orbital launches in 2016 (see this page). Two of them failed. The failure frequency is 2 / 85 ≈ 0.024. But does this tell us whether the pictured rocket will reach orbit? No, it doesn’t.

The failure frequency tells us the chance of failing to reach orbit for any contemporary rocket, given today's technical and managerial environment and assuming nothing changes. If any parameter of that environment varies, the failure chance changes too. By how much exactly?

The ‘simple’ solution is to collect failure evidence and average it over a time span, as above. The result is a generic failure frequency of generic orbital rockets, valid for the 366 days of 2016. This frequency relates only to a large population of units in a certain environment. For example, the year 1976 saw 5 orbital launches, one of which failed (refer here). That is a failure frequency of 0.2, about 8.5 times higher than the 2016 figure. We can see that a generic failure frequency is a rather gross, context-sensitive statistical generalization.
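The arithmetic behind the two generic frequencies can be sketched in a few lines of Python. This is only an illustration using the launch counts quoted above; the function name `failure_frequency` is made up for this example.

```python
from fractions import Fraction


def failure_frequency(failures: int, launches: int) -> Fraction:
    """Observed failures divided by attempts over the chosen time span."""
    if launches <= 0:
        raise ValueError("launches must be positive")
    return Fraction(failures, launches)


# Launch counts as quoted in the text above.
freq_2016 = failure_frequency(2, 85)
freq_1976 = failure_frequency(1, 5)
ratio = freq_1976 / freq_2016

print(f"2016: {float(freq_2016):.3f}")
print(f"1976: {float(freq_1976):.3f}")
print(f"1976 / 2016: {float(ratio):.1f}")
```

The ratio works out to 8.5, so the 1976 figure is close to an order of magnitude higher. Neither number says anything about one particular rocket.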

Getting back to the practical question: ‘What is the mission failure chance of the pictured rocket (or pressure vessel, or clothes hanger or any individual object)?’

If we are interested in a particular individual item, we need to model that item specifically, in terms of its intrinsic Probability of Failure (PoF). PoF is a physical measure of the failure chance as a function of time. It is sensitive to the individual operation history and increases continually, from a small ‘as built’ value towards a 100% failure chance.

To get a feel for the PoF, consider this funnel:

funnel with balls

  1. Imagine we are trying to pull a single ball out of this funnel without looking at it at all.
  2. Our chance of taking the red ball is 33% on the 1st attempt, 50% on the 2nd (if the red ball is still in the funnel) and 100% on the last.
  3. PoF increases with the number of attempts, or with time if chances are taken continuously.
  4. Say the red ball is now one in a million. How do you feel about finding it on the 1st attempt? How about the 1001st?
  5. When we operate plant equipment, explicit attempts are not even necessary. On the contrary, inaction in risk control amounts to taking chances over time.
  6. In other words, the balls drop out of the bottom of the funnel continuously, at a certain rate. This is due to equipment ageing in the form of corrosion, crack growth, or other time-dependent damage accumulation.
  7. Moreover, each piece of equipment has its own number of red balls in its funnel, and may even (by design) have several different funnels, which are the various operational damage mechanisms.
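The funnel numbers can be reproduced with a short Python sketch. It assumes blind draws without replacement and exactly one red ball; the helper name `draw_chances` is hypothetical. It returns two quantities: the chance of hitting the red ball on a given attempt if it has not come out yet, and the cumulative chance of having hit it by that attempt.

```python
from fractions import Fraction


def draw_chances(total_balls: int, attempt: int) -> tuple[Fraction, Fraction]:
    """Chances for blind draws without replacement, with ONE red ball.

    Returns (conditional, cumulative):
      conditional - chance of red on this attempt, given it is still inside
      cumulative  - chance that red has been drawn by the end of this attempt
    """
    if not 1 <= attempt <= total_balls:
        raise ValueError("attempt must be between 1 and total_balls")
    balls_left = total_balls - attempt + 1
    conditional = Fraction(1, balls_left)
    cumulative = Fraction(attempt, total_balls)
    return conditional, cumulative


# The three-ball funnel from the list above.
for k in (1, 2, 3):
    cond, cum = draw_chances(3, k)
    print(f"attempt {k}: conditional {float(cond):.0%}, cumulative {float(cum):.0%}")

# The one-in-a-million red ball, 1st attempt versus 1001st.
for k in (1, 1001):
    cond, cum = draw_chances(1_000_000, k)
    print(f"attempt {k}: conditional {float(cond):.7%}, cumulative {float(cum):.4%}")
```

In the one-in-a-million case the per-attempt chance barely moves between the 1st and the 1001st draw, while the cumulative chance grows about a thousandfold. That accumulation is what the funnel is meant to illustrate.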

Accordingly, operating actual equipment means taking failure chances continuously. What matters is the share of red balls among those still in the funnel: that share is the PoF, and it keeps increasing over time. Knowing the PoF versus time is like having a graph of our winning chance against the number of attempts at a casino game. Wouldn’t that be handy?

casino roulette

For industrial plant integrity management and budget control, knowing the PoF allows a confident inspection and maintenance plan that is also optimal for the budget.

Why, then, is the PoF(t) level largely missing in practice? Because academic science has mystified PoF evaluation with a heavy mathematical apparatus, one that aims to describe any event in the universe from a purely theoretical basis. Fortunately, we can minimize the effort of building heavy models, because we usually have sufficient evidence from our own plant operation.

For example, by inspecting oil and gas equipment we obtain integrity evidence for individual equipment items, relevant to the current, actual operating environment. This data is inherently probabilistic and matches the requirements of a PoF(t) assessment, which in turn gives the integrity knowledge needed for budget control.

At Quanty, we pioneer integrity management that is driven by PoF(t) and therefore substantiated by the Cost of Risk, as a confident, lifetime-oriented response to the current financially tough times. Our CoRBI® strategy addresses any changes in the integrity environment automatically, because it is based on actual integrity data rather than generic failure frequencies.

A question: in your opinion, has your plant's operating environment changed since the 1990s (when the generic failure data was collected), with respect to equipment manufacturing quality and integrity management?
