Semipredicate problem

In computer programming, a semipredicate problem occurs when a subroutine intended to return a useful value can fail, but the signalling of failure uses an otherwise valid return value. The problem is that the caller of the subroutine cannot tell what the result means in this case.

The division operation yields a real number, but fails when the divisor is zero. If we were to write a function that performs division, we might choose to return 0 on this invalid input. However, if the dividend is 0, the result is 0 too. This means there is no number we can return to uniquely signal attempted division by zero, since all real numbers are in the range of division.

Early programmers dealt with potentially exceptional cases, as in the case of division, using a convention that required the calling routine to check the validity of the inputs before calling the division function. This was undesirable for two reasons. First, it greatly encumbers all code that performs division (a very common operation). Second, it violates the important principle of encapsulation in programming, whereby the handling of one issue should be contained in one place in the code. If we imagine a more complicated computation than division, the caller may not even know that invalid input is being handed to the target function; indeed, figuring out that the input is invalid may be as costly as performing the entire computation.

The semipredicate problem is not universal among functions that can fail.

If the function's range does not cover the entire data type defined for the function, a value known to be impossible under normal computation can be used. For example, consider the function index, which takes a string and a substring, and returns the integer index of the substring in the main string. If the search fails, the function may be programmed to return -32,768 (or any other negative value), since this can never signify a successful result.

This solution has its problems, though; it overloads the natural meaning of a function with an arbitrary convention. First, the programmer must remember specific failure values for many functions, which of course cannot be identical if the functions have different domains. Second, a different implementation of the same function may choose to use a different failure value, resulting in possible bugs when programmers move from environment to environment. Third, if the failing function wishes to communicate useful information about why it had failed, one failure value is insufficient. Fourth, a signed integer halves the possible index range to be able to store the sign bit.

...
Wikipedia