Active redundancy


Active redundancy is a design concept that increases operational availability and reduces operating cost by automating most critical maintenance actions.

This concept is related to condition-based maintenance and fault reporting.

The initial requirement began with military combat systems during World War I. The approach used for survivability was to install thick armor plate to resist gunfire and to install multiple guns.

This became unaffordable and impractical during the Cold War when aircraft and missile systems became common.

The new approach was to build distributed systems that continue to work when components are damaged. This depends upon very crude forms of artificial intelligence that perform reconfiguration by obeying specific rules. An example of this approach is the AN/UYK-43 computer.
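Rule-driven reconfiguration of this kind can be illustrated with a minimal sketch. It is not a description of how the AN/UYK-43 or any real system works: the function name `reconfigure`, the unit names, and the single rule ("move each function on a failed unit to the first healthy, unused spare") are all hypothetical and chosen only for illustration.

```python
def reconfigure(assignments, healthy, spares):
    """Return a new function-to-unit mapping after applying one fixed rule.

    Rule (an assumption for this sketch): any function whose unit is no
    longer healthy is moved to the first healthy spare that is not already
    in use. If no spare is left, the function is mapped to None.
    """
    new_assignments = dict(assignments)
    in_use = set(assignments.values())
    # Spares that are still working and not already carrying a function.
    available = [s for s in spares if s in healthy and s not in in_use]

    for function, unit in assignments.items():
        if unit in healthy:
            continue  # nothing to do, the unit is still working
        new_assignments[function] = available.pop(0) if available else None
    return new_assignments


# Hypothetical example: cpu_a is damaged, cpu_c is a healthy spare.
current = {"radar": "cpu_a", "navigation": "cpu_b"}
result = reconfigure(current, healthy={"cpu_b", "cpu_c"}, spares=["cpu_c"])
print(result)  # {'radar': 'cpu_c', 'navigation': 'cpu_b'}
```

The point of the sketch is that no human decision is involved: the system keeps working as long as the rule can find a healthy spare.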

Formal design philosophies involving active redundancy are required for critical systems where it is undesirable or impractical to rely on corrective labor to fix failures during normal operation.

Commercial aircraft are required to have multiple redundant computing systems, hydraulic systems, and propulsion systems so that a single in-flight equipment failure will not cause loss of life.

A more recent outcome of this work is the Internet, which relies on a backbone of routers that can automatically re-route communication without human intervention when failures occur.

Satellites placed into orbit around the Earth must include massive active redundancy to ensure that operation continues for a decade or longer despite normal component failure, radiation-induced failure, and thermal shock.

This strategy now dominates space systems, aircraft, and missile systems.

Maintenance requires three actions, which usually involve down time and high-priority labor costs:

Active redundancy eliminates down time and reduces manpower requirements by automating all three actions. This requires some degree of artificial intelligence.

In "N+1" notation, N stands for the amount of equipment needed. The amount of excess capacity beyond N affects overall system reliability by limiting the effects of failure.

For example, if it takes two generators to power a city, then "N+1" would be three generators to allow a single failure. Similarly, "N+2" would be four generators, which would allow one generator to fail while a second generator has already failed.
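The effect of excess capacity in the generator example can be made concrete with a short calculation. The following is a minimal sketch, not a definitive reliability model: it assumes that units fail independently and that each unit is available 90% of the time, a figure chosen purely for illustration. The function names are hypothetical.

```python
from math import comb


def units_installed(needed, spares):
    """Total units in an N+spares configuration."""
    return needed + spares


def system_availability(needed, spares, unit_availability):
    """Probability that at least `needed` units are working.

    Assumes independent failures (an assumption of this sketch), so the
    number of working units follows a binomial distribution.
    """
    total = needed + spares
    p = unit_availability
    return sum(
        comb(total, k) * p**k * (1 - p) ** (total - k)
        for k in range(needed, total + 1)
    )


# Two generators are needed to power the city (N = 2).
for spares in (0, 1, 2):
    total = units_installed(2, spares)
    avail = system_availability(2, spares, unit_availability=0.9)
    print(f"N+{spares}: {total} generators, availability {avail:.4f}")
# N+0: 2 generators, availability 0.8100
# N+1: 3 generators, availability 0.9720
# N+2: 4 generators, availability 0.9963
```

Under these assumptions, each additional spare generator sharply reduces the chance that fewer than two generators are working.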


...
Wikipedia

...