Interrupt latency

In computing, interrupt latency is the time that elapses from when an interrupt is generated to when the source of the interrupt is serviced. For many operating systems, devices are serviced as soon as the device's interrupt handler is executed. Interrupt latency may be affected by microprocessor design, interrupt controllers, interrupt masking, and the operating system's (OS) interrupt handling methods.

There is usually a trade-off between interrupt latency, throughput, and processor utilization. Many of the techniques of CPU and OS design that improve interrupt latency will decrease throughput and increase processor utilization. Techniques that increase throughput may increase interrupt latency and increase processor utilization. Lastly, trying to reduce processor utilization may increase interrupt latency and decrease throughput.

Minimum interrupt latency is largely determined by the interrupt controller circuit and its configuration. They can also affect the jitter in the interrupt latency, which can drastically affect the real-time schedulability of the system. The Intel APIC Architecture is well known for producing a huge amount of interrupt latency jitter.

Maximum interrupt latency is largely determined by the methods an OS uses for interrupt handling. For example, most processors allow programs to disable interrupts, putting off the execution of interrupt handlers, in order to protect critical sections of code. During the execution of such a critical section, all interrupt handlers that cannot execute safely within a critical section are blocked (they save the minimum amount of information required to restart the interrupt handler after all critical sections have exited). So the interrupt latency for a blocked interrupt is extended to the end of the critical section, plus any interrupts with equal and higher priority that arrived while the block was in place.

...
Wikipedia