A safety-critical system or life-critical system is a system whose failure or malfunction may result in one (or more) of the following outcomes:
A safety-related system (or sometimes safety-involved system) comprises everything (hardware, software, and human aspects) needed to perform one or more safety functions, in which failure would cause a significant increase in the safety risk for the people and/or environment involved. Safety-related systems are those that do not have full responsibility for controlling hazards such as loss of life, severe injury or severe environmental damage. The malfunction of a safety-involved system would only be that hazardous in conjunction with the failure of other systems or human error. Some safety organizations provide guidance on safety-related systems, for example the Health and Safety Executive (HSE) in the United Kingdom.
Risks of this sort are usually managed with the methods and tools of safety engineering. A safety-critical system is designed to lose less than one life per billion (109) hours of operation. Typical design methods include probabilistic risk assessment, a method that combines failure mode and effects analysis (FMEA) with fault tree analysis. Safety-critical systems are increasingly computer-based.
Several reliability regimes for safety-critical systems exist:
Software engineering for safety-critical systems is particularly difficult. There are three aspects which can be applied to aid the engineering software for life-critical systems. First is process engineering and management. Secondly, selecting the appropriate tools and environment for the system. This allows the system developer to effectively test the system by emulation and observe its effectiveness. Thirdly, address any legal and regulatory requirements, such as FAA requirements for aviation. By setting a standard for which a system is required to be developed under, it forces the designers to stick to the requirements. The avionics industry has succeeded in producing standard methods for producing life-critical avionics software. Similar standards exist for automotive (ISO 26262), Medical (IEC 62304) and nuclear (IEC 61513) industries. The standard approach is to carefully code, inspect, document, test, verify and analyze the system. Another approach is to certify a production system, a compiler, and then generate the system's code from specifications. Another approach uses formal methods to generate proofs that the code meets requirements. All of these approaches improve the software quality in safety-critical systems by testing or eliminating manual steps in the development process, because people make mistakes, and these mistakes are the most common cause of potential life-threatening errors.