Pollard's rho algorithm is a special-purpose integer factorization algorithm. It was invented by John Pollard in 1975. It is particularly effective for a composite number having a small prime factor.
The ρ algorithm is based on Floyd's cycle-finding algorithm and on the observation that (as in the birthday problem) t random numbers x1, x2, ... , xt in the range [1, n] will contain a repetition with probability P > 0.5 if t > 1.177n1/2. The constant 1.177 comes from the more general result that if P is the probability that t random numbers in the range [1, n] contain a repetition, then P > 1 - exp{ - t2/2n }. Thus P > 0.5 provided 1/2 > exp{ - t2/2n }, or t2 > 2n ln 2, or t > (2ln 2)1/2n1/2 = 1.177n1/2.
The ρ algorithm uses g(x), a polynomial modulo n, as a generator of a pseudo-random sequence. (The most commonly used function is g(x) = (x2 + 1) mod n.) Let's assume n = pq. The algorithm generates the sequence x1 = g(2), x2 = g(g(2)), x3 = g(g(g(2))), and so on. Two different sequences will in effect be running at the same time—the sequence {xk} and the sequence {xk mod p}. Since p < n1/2, the latter sequence is likely to repeat earlier than the former sequence. The repetition of the mod p sequence will be detected by the fact that gcd(xk mod p - xm mod p, n) = p, where k < m. Once a repetition occurs, the sequence will cycle, because each term depends only on the previous one. The name ρ algorithm derives from the similarity in appearance between the Greek letter ρ and the directed graph formed by the values in the sequence and their successors. Once it is cycling, Floyd's cycle-finding algorithm will eventually detect a repetition. The algorithm succeeds whenever the sequence {xk mod p} repeats before the sequence {xk}. The randomizing function g(x) must be a polynomial modulo n, so that it will work both modulo p and modulo n. That is, so that g(x mod p) ≡ g(x) (mod p).
The algorithm takes as its inputs n, the integer to be factored; and , a polynomial computed modulo n. This will ensure that if , and , then . In the original algorithm, , but nowadays it is more common to use . The output is either a non-trivial factor of n, or failure. It performs the following steps: