Response time (technology)

In technology, response time is the time a system or functional unit takes to react to a given input.

Response time is the total amount of time it takes to respond to a request for service. That service can be anything from a memory fetch, to a disk I/O, to a complex database query, to loading a full web page. Ignoring transmission time for a moment, the response time is the sum of the service time and the wait time. The service time is the time it takes to do the work requested. For a given request the service time varies little as the workload increases: to do X amount of work it always takes X amount of time. The wait time is how long the request had to wait in a queue before being serviced. It varies from zero, when no waiting is required, to a large multiple of the service time, when many requests are already in the queue and have to be serviced first.
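
The relationship between service time, wait time, and response time can be illustrated with a minimal sketch. It assumes a single server that handles requests one at a time in arrival order and a fixed service time per request; the function name and the sample arrival times are hypothetical and chosen only for illustration.

```python
def fifo_response_times(arrivals, service_time):
    """Return (wait, response) pairs for requests served one at a time, in arrival order.

    Assumes one server, a fixed service time, and arrivals sorted by time.
    """
    results = []
    server_free_at = 0.0  # time at which the server finishes its current backlog
    for arrival in arrivals:
        start = max(arrival, server_free_at)  # wait only if the server is still busy
        wait = start - arrival
        response = wait + service_time        # response time = wait time + service time
        results.append((wait, response))
        server_free_at = start + service_time
    return results


# Three requests arrive close together, a fourth arrives after the queue has drained.
timings = fifo_response_times([0.0, 1.0, 2.0, 20.0], service_time=5.0)
for wait, response in timings:
    print(f"wait={wait:.1f}  response={response:.1f}")
```

In this example the service time is always 5 time units, but the second and third requests wait 4 and 8 units behind earlier work, so their response times are 9 and 13 units. The fourth request finds the server idle and its response time equals the service time.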

With basic queueing theory math you can calculate how the average wait time increases as the device providing the service goes from 0% to 100% busy. As the device becomes busier, the average wait time increases in a non-linear fashion, and the increase in response time becomes more dramatic as utilization approaches 100%. All of that increase is caused by wait time: the requests already in the queue have to run first.
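
The article does not name a particular queueing model. As one illustrative sketch, the following assumes the textbook M/M/1 model (a single server with random arrivals and exponentially distributed service times), in which the average wait time is S × ρ / (1 − ρ) for an average service time S and utilization ρ. The function names are hypothetical.

```python
def mm1_wait_time(service_time, utilization):
    """Average time spent waiting in queue under the M/M/1 model (an assumed model)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time * utilization / (1.0 - utilization)


def mm1_response_time(service_time, utilization):
    """Average response time: service time plus average wait time."""
    return service_time + mm1_wait_time(service_time, utilization)


# Average response time for a 10 ms service time at increasing utilization.
for busy in (0.10, 0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"{busy:.0%} busy: {mm1_response_time(10.0, busy):.1f} ms")
```

Under this model the average response time for a 10 ms service time is 20 ms at 50% busy and 100 ms at 90% busy, which shows the non-linear growth near saturation. Real devices may follow other arrival and service distributions, so the exact curve will differ.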

Transmission time is added to the response time when the request and the resulting response have to travel over a network, and it can be very significant. Transmission time can include propagation delays due to distance (the speed of light is finite), delays due to transmission errors, and data communication bandwidth limits (especially at the last mile) that slow the transmission of the request or the reply.
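
A rough one-way estimate of two of these components, propagation delay and the bandwidth limit, can be sketched as follows. The function name and parameters are hypothetical, and the default velocity factor of 0.67 is only an assumed ballpark for signals in optical fiber. Delays caused by transmission errors, routing, and intermediate equipment are ignored.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def one_way_transfer_time(distance_m, payload_bytes, bandwidth_bits_per_s,
                          velocity_factor=0.67):
    """Estimate seconds needed to deliver a payload over a single link.

    velocity_factor is the assumed fraction of the speed of light at which
    the signal travels through the medium.
    """
    propagation = distance_m / (SPEED_OF_LIGHT_M_PER_S * velocity_factor)
    serialization = payload_bytes * 8 / bandwidth_bits_per_s  # time to push the bits onto the link
    return propagation + serialization


# Hypothetical example: a 2 MB reply over a 5,000 km path with a 20 Mbit/s last mile.
seconds = one_way_transfer_time(5_000_000, 2_000_000, 20_000_000)
print(f"estimated one-way transfer time: {seconds:.3f} s")
```

For a full request and reply, the estimate would be computed once in each direction and added to the service and wait times.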

In real-time systems, the response time of a task or thread is defined as the time elapsed between its dispatch (the time when the task is ready to execute) and the time when it finishes its job (one dispatch). Response time is different from the worst-case execution time (WCET), which is the maximum time the task would take if it were to execute without interference. It is also different from the deadline, which is the length of time during which the task's output would be valid in the context of the specific system.
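
The difference between WCET, response time, and deadline can be illustrated with a small sketch. The article does not specify a scheduling policy, so the code assumes fixed-priority preemptive scheduling of periodic tasks and uses the classic response-time iteration R = C + Σ ⌈R / T_j⌉ × C_j over the higher-priority tasks j. The function name and task values are hypothetical, and times are integers in arbitrary units.

```python
import math


def worst_case_response_time(wcet, higher_priority, deadline):
    """Iterate R = wcet + sum(ceil(R / period_j) * wcet_j) until it stabilizes.

    higher_priority is a list of (wcet_j, period_j) pairs for tasks that can
    preempt this one. Returns the response time, or None if it exceeds the deadline.
    """
    response = wcet  # with no interference, response time equals the WCET
    while response <= deadline:
        interference = sum(math.ceil(response / period) * c
                           for c, period in higher_priority)
        next_response = wcet + interference
        if next_response == response:
            return response  # fixed point reached
        response = next_response
    return None  # the task cannot be shown to finish before its deadline


# A task with WCET 3 that can be preempted by two higher-priority tasks.
r = worst_case_response_time(3, [(1, 4), (2, 6)], deadline=12)
print("worst-case response time:", r)
```

Here the task's WCET is 3 units, but interference from the two higher-priority tasks stretches its worst-case response time to 10 units, which still fits within the deadline of 12. With a deadline of 8 the function returns None.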

...
Wikipedia