
MTBF


Mean time between failures (MTBF) is the predicted elapsed time between inherent failures of a system during operation. MTBF can be calculated as the arithmetic mean (average) time between failures of a system. The term is used in both plant and equipment maintenance contexts.

The definition of MTBF depends on what is considered a system failure. For complex, repairable systems, failures are those out-of-design conditions that place the system out of service and into a state for repair. Faults that can be left or maintained in an unrepaired condition, and that do not place the system out of service, are not considered failures under this definition. In addition, units taken down for routine scheduled maintenance or inventory control are not counted as failures.

Mean time between failures (MTBF) describes the expected time between two failures for a repairable system, while mean time to failure (MTTF) denotes the expected time to failure for a non-repairable system. For example, consider three identical systems that start functioning properly at time 0 and run until all of them have failed. The first system fails at 100 hours, the second at 120 hours and the third at 130 hours. The MTBF of the systems is the average of the three failure times, (100 + 120 + 130) / 3 ≈ 116.667 hours. If the systems are non-repairable, then their MTTF is 116.667 hours.
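
The arithmetic in this example can be checked with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and the three failure times are the ones given in the example above.

```python
from statistics import mean

# Failure times (in hours) of the three identical systems from the example.
failure_times_hours = [100, 120, 130]

# MTBF (or MTTF, if the systems are non-repairable) is the arithmetic mean.
mtbf_hours = mean(failure_times_hours)

print(round(mtbf_hours, 3))  # 116.667
```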

In general, MTBF is the "up-time" between two failure states of a repairable system during operation as outlined here:

[Figure: Time between failures]

For each observation, the "down time" is the instant at which the system went down, which is later than (i.e. greater than) the "up time", the instant at which it came up. The difference ("down time" minus "up time") is the amount of time the system was operating between these two events.
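
Averaging these differences over all observations gives the MTBF. The sketch below illustrates that calculation. The function name `mtbf_from_observations` and the timestamps are hypothetical and chosen only for illustration; they are not taken from the source.

```python
def mtbf_from_observations(observations):
    """Return the mean up-time for a list of (up_time, down_time) pairs.

    Each pair records the instant the system came up and the instant it
    next went down, in the same time unit (hours here).
    """
    if not observations:
        raise ValueError("at least one observation is required")
    for up_time, down_time in observations:
        # The down time must be strictly later than the up time.
        if down_time <= up_time:
            raise ValueError("down time must be greater than up time")
    total_up = sum(down_time - up_time for up_time, down_time in observations)
    return total_up / len(observations)


# Hypothetical log (assumed values): the gaps between one pair's down time
# and the next pair's up time are repair periods and are not counted.
observations = [(0.0, 100.0), (110.0, 230.0), (245.0, 375.0)]

mtbf_hours = mtbf_from_observations(observations)
print(round(mtbf_hours, 3))  # 116.667
```

Only the operating intervals enter the sum, so time spent under repair does not affect the result.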

Once the MTBF of a system is known, the probability that any one particular system will still be operational at a time equal to the MTBF can be calculated. This calculation requires that the system be operating within its "useful life period", which is characterized by a relatively constant failure rate (the middle part of the "bathtub curve"), when only random failures occur.
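
As a sketch of that calculation, assume the exponential model that a constant failure rate implies: with failure rate 1/MTBF, the probability of surviving to time t is exp(-t / MTBF). The function name `reliability` is illustrative, and the MTBF value is reused from the three-system example.

```python
import math


def reliability(t_hours, mtbf_hours):
    """Probability of still operating at time t, assuming a constant
    failure rate of 1 / MTBF (exponential model)."""
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    if t_hours < 0:
        raise ValueError("time must be non-negative")
    return math.exp(-t_hours / mtbf_hours)


mtbf_hours = 350 / 3  # about 116.667 hours

# At t = MTBF the exponent is -1, so the result is 1/e whatever the MTBF.
p_at_mtbf = reliability(mtbf_hours, mtbf_hours)
print(f"{p_at_mtbf:.4f}")  # 0.3679
```

Under this assumption, a given system has only about a 36.8% chance of still being operational at a time equal to its MTBF.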


...
Wikipedia

...