Divergence (statistics)

In statistics and information geometry, a divergence or contrast function is a function that measures the "distance" from one probability distribution to another on a statistical manifold. A divergence is a weaker notion than a distance: in particular, it need not be symmetric (that is, in general the divergence from p to q is not equal to the divergence from q to p) and need not satisfy the triangle inequality.

Suppose S is a space of all probability distributions with common support. Then a divergence on S is a function D(· || ·): S × S → R satisfying

1. D(p || q) ≥ 0 for all p, q ∈ S,
2. D(p || q) = 0 if and only if p = q.
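
As an illustration, the Kullback–Leibler divergence is a standard example of a divergence. The following minimal sketch assumes discrete distributions with strictly positive probabilities on a common support; the function name `kl_divergence` and the sample vectors `p` and `q` are chosen here for illustration only. It checks non-negativity and vanishing at p = q, and shows that swapping the arguments changes the value.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions
    given as sequences of strictly positive probabilities on a common support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Two hypothetical distributions on a three-point support.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]

d_pq = kl_divergence(p, q)  # divergence from p to q
d_qp = kl_divergence(q, p)  # divergence from q to p
d_pp = kl_divergence(p, p)  # divergence from p to itself

print(f"D(p || q) = {d_pq:.4f}")
print(f"D(q || p) = {d_qp:.4f}")
print(f"D(p || p) = {d_pp:.4f}")
```

Both off-diagonal values are positive but differ from each other, while the divergence from p to itself is zero.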

The dual divergence D* is defined as

$$D^*(p \parallel q) = D(q \parallel p).$$
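
The dual divergence simply exchanges the two arguments, so it can be sketched as a wrapper around any divergence function. The helper name `dual` and the two example divergences below (Kullback–Leibler and squared Euclidean distance) are assumptions made for this illustration.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for strictly positive discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def squared_euclidean(p, q):
    """Squared Euclidean distance, a symmetric example of a divergence."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def dual(divergence):
    """Return the dual divergence D*, with D*(p || q) = D(q || p)."""
    def dual_divergence(p, q):
        return divergence(q, p)
    return dual_divergence

p = [0.5, 0.3, 0.2]
q = [0.2, 0.2, 0.6]

reverse_kl = dual(kl_divergence)

# The dual of KL evaluated at (p, q) equals KL evaluated at (q, p).
print(reverse_kl(p, q), kl_divergence(q, p))
# Taking the dual twice recovers the original divergence.
print(dual(reverse_kl)(p, q), kl_divergence(p, q))
# A symmetric divergence coincides with its own dual.
print(dual(squared_euclidean)(p, q), squared_euclidean(p, q))
```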

Many properties of divergences can be derived if we restrict S to be a statistical manifold, meaning that it can be parametrized with a finite-dimensional coordinate system θ, so that for a distribution p ∈ S we can write p = p(θ).

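For a pair of points p, q ∈ S with coordinates θp and θq, denote the partial derivatives of D(p || q) as

$$D((\partial_i)_p \parallel q) = \frac{\partial}{\partial\theta_p^i} D(p \parallel q),$$
$$D((\partial_i \partial_j)_p \parallel (\partial_k)_q) = \frac{\partial}{\partial\theta_p^i} \frac{\partial}{\partial\theta_p^j} \frac{\partial}{\partial\theta_q^k} D(p \parallel q), \quad \text{etc.}$$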

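Now we restrict these functions to the diagonal p = q, and denote

$$D[\partial_i \parallel \cdot] : p \mapsto D((\partial_i)_p \parallel p),$$
$$D[\partial_i \parallel \partial_j] : p \mapsto D((\partial_i)_p \parallel (\partial_j)_p), \quad \text{etc.}$$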

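By definition, the function D(p || q) is minimized at p = q, and therefore

$$D[\partial_i \parallel \cdot] = D[\cdot \parallel \partial_i] = 0,$$
$$D[\partial_i \partial_j \parallel \cdot] = D[\cdot \parallel \partial_i \partial_j] = -D[\partial_i \parallel \partial_j] \equiv g_{ij}^{(D)},$$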

where the matrix g^(D) is positive semi-definite and, wherever it is strictly positive-definite, defines a unique Riemannian metric on the manifold S.
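
The induced metric can be checked numerically on a one-parameter family. The sketch below assumes the usual convention that g^(D) is minus the mixed second derivative of D(p || q) in θp and θq on the diagonal, equivalently the second derivative in θp alone. It uses the Bernoulli family with the Kullback–Leibler divergence, for which the resulting metric is known to be the Fisher information 1/(θ(1 − θ)). The function names and the step size `h` are illustrative choices, not part of the definition.

```python
import math

def kl_bernoulli(theta_p, theta_q):
    """KL divergence D(p || q) between Bernoulli(theta_p) and Bernoulli(theta_q)."""
    return (theta_p * math.log(theta_p / theta_q)
            + (1 - theta_p) * math.log((1 - theta_p) / (1 - theta_q)))

def induced_metric(divergence, theta, h=1e-4):
    """Estimate g = -d/dtheta_p d/dtheta_q D(p || q) on the diagonal p = q
    with a central finite difference in both arguments."""
    mixed = (divergence(theta + h, theta + h)
             - divergence(theta + h, theta - h)
             - divergence(theta - h, theta + h)
             + divergence(theta - h, theta - h)) / (4 * h * h)
    return -mixed

def second_derivative_first_arg(divergence, theta, h=1e-4):
    """Estimate d^2/dtheta_p^2 D(p || q) on the diagonal p = q."""
    return (divergence(theta + h, theta)
            - 2 * divergence(theta, theta)
            + divergence(theta - h, theta)) / (h * h)

theta = 0.3
g_mixed = induced_metric(kl_bernoulli, theta)
g_second = second_derivative_first_arg(kl_bernoulli, theta)
fisher = 1 / (theta * (1 - theta))  # Fisher information of the Bernoulli family

print(g_mixed, g_second, fisher)
```

With θ = 0.3 both estimates should agree with 1/(0.3 · 0.7) ≈ 4.76 up to discretization error.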

The divergence D(· || ·) also defines a unique torsion-free affine connection ∇^(D) with coefficients

$$\Gamma_{ij,k}^{(D)} = -D[\partial_i \partial_j \parallel \partial_k],$$

and the dual to this connection ∇* is generated by the dual divergence D*.
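
The connection coefficients can be estimated in the same finite-difference manner. The sketch below assumes the convention that the coefficient is minus the derivative of D(p || q) taken twice in θp and once in θq, evaluated on the diagonal, and again uses the Bernoulli family with the Kullback–Leibler divergence. Differentiating by hand for this particular example gives a vanishing coefficient for D and 1/(1 − θ)² − 1/θ² for the dual divergence D*; these closed forms are worked out for this illustration only, and all function names are hypothetical.

```python
import math

def kl_bernoulli(theta_p, theta_q):
    """KL divergence D(p || q) between Bernoulli(theta_p) and Bernoulli(theta_q)."""
    return (theta_p * math.log(theta_p / theta_q)
            + (1 - theta_p) * math.log((1 - theta_p) / (1 - theta_q)))

def dual(divergence):
    """Return the dual divergence D*, with D*(p || q) = D(q || p)."""
    return lambda theta_p, theta_q: divergence(theta_q, theta_p)

def connection_coefficient(divergence, theta, h=1e-3):
    """Estimate Gamma = -d^2/dtheta_p^2 d/dtheta_q D(p || q) on the diagonal
    p = q, using a central second difference in the first argument and a
    central first difference in the second argument."""
    def second_diff_first_arg(theta_q):
        return (divergence(theta + h, theta_q)
                - 2 * divergence(theta, theta_q)
                + divergence(theta - h, theta_q)) / (h * h)
    mixed = (second_diff_first_arg(theta + h)
             - second_diff_first_arg(theta - h)) / (2 * h)
    return -mixed

theta = 0.3
gamma = connection_coefficient(kl_bernoulli, theta)             # connection of D
gamma_dual = connection_coefficient(dual(kl_bernoulli), theta)  # connection of D*
gamma_dual_exact = 1 / (1 - theta) ** 2 - 1 / theta ** 2        # hand-derived value

print(gamma, gamma_dual, gamma_dual_exact)
```

At θ = 0.3 the hand-derived dual coefficient is 1/0.49 − 1/0.09 ≈ −9.07, while the coefficient for D itself should be zero up to rounding, so the two connections differ even though both divergences induce the same metric.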

