## Gini Coefficient

The Gini coefficient (sometimes expressed as a Gini ratio or a normalized Gini index) (/ˈdʒiːni/ JEE-nee) is a measure of statistical dispersion intended to represent the income or wealth distribution of a nation's residents, and is the most commonly used measure of inequality. It was developed by the Italian statistician and sociologist Corrado Gini and published in his 1912 paper Variability and Mutability (Italian: Variabilità e mutabilità).

The Gini coefficient measures the inequality among values of a frequency distribution (for example, levels of income). A Gini coefficient of zero expresses perfect equality, where all values are the same (for example, where everyone has the same income). A Gini coefficient of 1 (or 100%) expresses maximal inequality among values (for example, in a large population where one person has all the income or consumption and all others have none, the Gini coefficient will be very nearly one). A value greater than one may occur if some persons make a negative contribution to the total (for example, by having negative income or wealth). For larger groups, values close to or above 1 are very unlikely in practice.

Because both the cumulative population and the cumulative share of income are normalized in the calculation, the Gini coefficient is not overly sensitive to the specifics of the income distribution; it depends only on how incomes vary relative to the other members of the population. The exception is a redistribution of wealth that results in a minimum income for all people. When the population is sorted, if its income distribution approximates a well-known function, then some representative values can be calculated.
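The boundary cases described above can be checked numerically. The following is a minimal Python sketch, assuming the coefficient is computed as half the relative mean absolute difference of the values; the function name `gini_pairwise` and the sample incomes are illustrative and not taken from the source.

```python
def gini_pairwise(values):
    """Half the relative mean absolute difference of the values."""
    n = len(values)
    total = sum(values)
    # Sum of |a - b| over all ordered pairs (each unordered pair counted twice).
    abs_diffs = sum(abs(a - b) for a in values for b in values)
    return abs_diffs / (2 * n * total)


equal = [100] * 5                # everyone has the same income
one_takes_all = [0] * 99 + [1000]  # one person holds everything
with_debt = [-50, 10, 20, 120]   # one negative contribution to the total

print(gini_pairwise(equal))          # 0.0
print(gini_pairwise(one_takes_all))  # 0.99, i.e. (n - 1) / n for n = 100
print(gini_pairwise(with_debt))      # 1.3, above one because of the negative value
```

The quadratic loop is kept for clarity; it is only practical for small populations.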

| Income distribution function | PDF(x) | Gini coefficient |
|---|---|---|
| Dirac delta function | ${\displaystyle \delta (x-x_{0}),\,x_{0}>0}$ | 0 |
| Uniform distribution | ${\displaystyle {\begin{cases}{\frac {1}{b-a}}&a\leq x\leq b\\0&\mathrm {otherwise} \end{cases}}}$ | ${\displaystyle {\frac {b-a}{3(b+a)}}}$ |
| Exponential distribution | ${\displaystyle \lambda e^{-x\lambda },\,\,x>0}$ | ${\displaystyle 1/2}$ |
| Log-normal distribution | ${\displaystyle {\frac {1}{x\sigma {\sqrt {2\pi }}}}e^{-{\frac {(\ln \,(x)-\mu )^{2}}{2\sigma ^{2}}}}}$ | ${\displaystyle {\textrm {erf}}(\sigma /2)}$ |
| Pareto distribution | ${\displaystyle {\begin{cases}{\frac {\alpha k^{\alpha }}{x^{\alpha +1}}}&x\geq k\\0&x<k\end{cases}}}$ | ${\displaystyle {\begin{cases}1&0<\alpha <1\\{\frac {1}{2\alpha -1}}&\alpha \geq 1\end{cases}}}$ |
| Chi-squared distribution | ${\displaystyle {\frac {2^{-k/2}e^{-x/2}x^{k/2-1}}{\Gamma (k/2)}}}$ | ${\displaystyle {\frac {2\,\Gamma \left({\frac {1+k}{2}}\right)}{k\,\Gamma (k/2){\sqrt {\pi }}}}}$ |
| Gamma distribution | ${\displaystyle {\frac {e^{-x/\theta }x^{k-1}\theta ^{-k}}{\Gamma (k)}}}$ | ${\displaystyle {\frac {\Gamma \left({\frac {2k+1}{2}}\right)}{k\,\Gamma (k){\sqrt {\pi }}}}}$ |
| Weibull distribution | ${\displaystyle {\frac {k}{\lambda }}\,\left({\frac {x}{\lambda }}\right)^{k-1}e^{-(x/\lambda )^{k}}}$ | ${\displaystyle 1-2^{-1/k}}$ |
| Beta distribution | ${\displaystyle {\frac {x^{\alpha -1}(1-x)^{\beta -1}}{B(\alpha ,\beta )}}}$ | ${\displaystyle \left({\frac {2}{\alpha }}\right){\frac {B(\alpha +\beta ,\alpha +\beta )}{B(\alpha ,\alpha )B(\beta ,\beta )}}}$ |
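
Several of the closed-form entries above can be evaluated directly and cross-checked against each other. The sketch below uses only Python's standard `math` module; the function names are illustrative choices, not part of the source. It relies on the facts that a Weibull distribution with shape 1, a gamma distribution with shape 1, and a chi-squared distribution with 2 degrees of freedom are all exponential distributions, so each should give 1/2.

```python
import math


def gini_uniform(a, b):
    """Uniform on [a, b] with 0 <= a < b."""
    return (b - a) / (3 * (b + a))


def gini_lognormal(sigma):
    return math.erf(sigma / 2)


def gini_pareto(alpha):
    """Pareto with shape alpha > 0."""
    return 1.0 if alpha < 1 else 1 / (2 * alpha - 1)


def gini_weibull(k):
    return 1 - 2 ** (-1 / k)


def gini_gamma(k):
    """Gamma with shape k (the scale theta does not affect the result)."""
    return math.gamma((2 * k + 1) / 2) / (k * math.gamma(k) * math.sqrt(math.pi))


def gini_chi_squared(k):
    """Chi-squared with k degrees of freedom."""
    return 2 * math.gamma((1 + k) / 2) / (k * math.gamma(k / 2) * math.sqrt(math.pi))


# Three routes to the exponential distribution's value of 1/2.
print(gini_weibull(1), gini_gamma(1), gini_chi_squared(2))
# Uniform income between 0 and any positive upper bound gives 1/3.
print(gini_uniform(0, 50_000))
```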
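
Income Gini coefficient, World, 1820–2005

| Year | World Gini coefficient |
|---|---|
| 1820 | 0.43 |
| 1850 | 0.53 |
| 1870 | 0.56 |
| 1913 | 0.61 |
| 1929 | 0.62 |
| 1950 | 0.64 |
| 1960 | 0.64 |
| 1980 | 0.66 |
| 2002 | 0.71 |
| 2005 | 0.68 |

| Year | World Gini coefficient |
|---|---|
| 1988 | 0.80 |
| 1993 | 0.76 |
| 1998 | 0.74 |
| 2003 | 0.72 |
| 2008 | 0.70 |
| 2013 | 0.65 |
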
Table A. Different income distributions with the same Gini Index

| Household group | Country A annual income ($) | Country B annual income ($) |
|---|---|---|
| 1 | 20,000 | 9,000 |
| 2 | 30,000 | 40,000 |
| 3 | 40,000 | 48,000 |
| 4 | 50,000 | 48,000 |
| 5 | 60,000 | 55,000 |
| Total income | \$200,000 | \$200,000 |
| Country's Gini | 0.2 | 0.2 |
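
The two Gini values in Table A can be reproduced from the listed incomes. The sketch below assumes each household group carries equal weight and uses the rank-weighted formula for values sorted in non-decreasing order, $G = \frac{2\sum_i i\,y_i}{n\sum_i y_i} - \frac{n+1}{n}$; the function name `gini_sorted` is illustrative.

```python
def gini_sorted(incomes):
    """Gini coefficient from the rank-weighted sum of sorted values."""
    y = sorted(incomes)
    n = len(y)
    # Ranks start at 1 for the smallest value.
    rank_weighted = sum(i * v for i, v in enumerate(y, start=1))
    return 2 * rank_weighted / (n * sum(y)) - (n + 1) / n


country_a = [20_000, 30_000, 40_000, 50_000, 60_000]
country_b = [9_000, 40_000, 48_000, 48_000, 55_000]

print(round(gini_sorted(country_a), 3))  # 0.2
print(round(gini_sorted(country_b), 3))  # 0.2
```

Both countries have the same total income and the same rank-weighted sum, which is why very different distributions yield the same coefficient.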

Table B. Same income distributions but different Gini Index

| Household number | Country A annual income ($) | Household combined number | Country A combined annual income ($) |
|---|---|---|---|
| 1 | 20,000 | 1 & 2 | 50,000 |
| 2 | 30,000 | | |
| 3 | 40,000 | 3 & 4 | 90,000 |
| 4 | 50,000 | | |
| 5 | 60,000 | 5 & 6 | 130,000 |
| 6 | 70,000 | | |
| 7 | 80,000 | 7 & 8 | 170,000 |
| 8 | 90,000 | | |
| 9 | 120,000 | 9 & 10 | 270,000 |
| 10 | 150,000 | | |
| Total income | \$710,000 | | \$710,000 |
| Country's Gini | 0.303 | | 0.293 |
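
The effect of aggregation in Table B can be reproduced by pairing adjacent households and recomputing. This sketch assumes equal-weight households and uses a trapezoidal sum under the Lorenz curve, $G = 1 - \sum_k (X_k - X_{k-1})(Y_k + Y_{k-1})$, where each household adds $1/n$ to the cumulative population share; the function name `gini_lorenz` is illustrative.

```python
def gini_lorenz(incomes):
    """Gini coefficient as 1 minus twice the trapezoidal area under the Lorenz curve."""
    y = sorted(incomes)
    n = len(y)
    total = sum(y)
    g = 1.0
    running = 0
    prev_share = 0.0  # Y_{k-1}, cumulative income share so far
    for value in y:
        running += value
        share = running / total  # Y_k
        g -= (1 / n) * (share + prev_share)  # (X_k - X_{k-1}) = 1 / n
        prev_share = share
    return g


households = [20_000, 30_000, 40_000, 50_000, 60_000,
              70_000, 80_000, 90_000, 120_000, 150_000]
# Merge households 1 & 2, 3 & 4, ... as in the right-hand columns of Table B.
combined = [a + b for a, b in zip(households[0::2], households[1::2])]

print(round(gini_lorenz(households), 3))  # 0.303
print(round(gini_lorenz(combined), 3))    # 0.293
```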

Table C. Household money income distributions and Gini Index, USA

| Income bracket | % of population, 1979 | % of population, 2010 |
|---|---|---|
| Under \$15,000 | 14.6% | 13.7% |
| \$15,000 – \$24,999 | 11.9% | 12.0% |
| \$25,000 – \$34,999 | 12.1% | 10.9% |
| \$35,000 – \$49,999 | 15.4% | 13.9% |
| \$50,000 – \$74,999 | 22.1% | 17.7% |
| \$75,000 – \$99,999 | 12.4% | 11.4% |
| \$100,000 – \$149,999 | 8.3% | 12.1% |
| \$150,000 – \$199,999 | 2.0% | 4.5% |
| \$200,000 and over | 1.2% | 3.9% |
| Total households | 80,776,000 | 118,682,000 |
| United States' Gini on pre-tax basis | 0.404 | 0.469 |

${\displaystyle G={\frac {\displaystyle {\sum _{i=1}^{n}\sum _{j=1}^{n}\left|x_{i}-x_{j}\right|}}{\displaystyle {2\sum _{i=1}^{n}\sum _{j=1}^{n}x_{j}}}}={\frac {\displaystyle {\sum _{i=1}^{n}\sum _{j=1}^{n}\left|x_{i}-x_{j}\right|}}{\displaystyle {2n\sum _{i=1}^{n}x_{i}}}}}$
${\displaystyle G={\frac {1}{2\mu }}\int _{-\infty }^{\infty }\int _{-\infty }^{\infty }p(x)p(y)\,|x-y|\,dx\,dy}$
${\displaystyle G={\frac {1}{n}}\left(n+1-2\left({\frac {\sum \limits _{i=1}^{n}\;(n+1-i)y_{i}}{\sum \limits _{i=1}^{n}y_{i}}}\right)\right)}$
This may be simplified to:
${\displaystyle G={\frac {2\Sigma _{i=1}^{n}\;iy_{i}}{n\Sigma _{i=1}^{n}y_{i}}}-{\frac {n+1}{n}}}$
This formula actually applies to any real population, since each person can be assigned his or her own $y_i$.
${\displaystyle G(S)={\frac {1}{n-1}}\left(n+1-2\left({\frac {\Sigma _{i=1}^{n}\;(n+1-i)y_{i}}{\Sigma _{i=1}^{n}y_{i}}}\right)\right)}$
is a consistent estimator of the population Gini coefficient, but is not, in general, unbiased. Like $G$, $G(S)$ has a simpler form:
${\displaystyle G(S)=1-{\frac {2}{n-1}}\left(n-{\frac {\Sigma _{i=1}^{n}\;iy_{i}}{\Sigma _{i=1}^{n}y_{i}}}\right)}$.
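
The population formula $G$ and the sample statistic $G(S)$ differ only in their normalization: algebraically, $G(S) = G \cdot n/(n-1)$. The sketch below illustrates this with a tiny data set; the function names and the example values are illustrative assumptions, and both functions expect at least two values with a positive total.

```python
def _rank_ratio(values):
    """Return (n, sum(i * y_i) / sum(y_i)) for values sorted in non-decreasing order."""
    y = sorted(values)
    ranked = sum(i * v for i, v in enumerate(y, start=1))
    return len(y), ranked / sum(y)


def gini_population(values):
    n, ratio = _rank_ratio(values)
    return 2 * ratio / n - (n + 1) / n


def gini_sample(values):
    n, ratio = _rank_ratio(values)
    return 1 - (2 / (n - 1)) * (n - ratio)


data = [0, 0, 0, 10]  # one of four units holds everything
print(gini_population(data))  # 0.75, i.e. (n - 1) / n
print(gini_sample(data))      # 1.0
```

With the $n-1$ normalization, a sample in which one unit holds everything gives exactly 1, whereas the population formula gives $(n-1)/n$.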
${\displaystyle G={\frac {1}{2\mu }}\sum \limits _{i=1}^{n}\sum \limits _{j=1}^{n}\,f(y_{i})f(y_{j})|y_{i}-y_{j}|}$
where
${\displaystyle \mu =\sum \limits _{i=1}^{n}y_{i}f(y_{i})}$
If the points with nonzero probabilities are indexed in increasing order ($y_i < y_{i+1}$) then:
${\displaystyle G=1-{\frac {\Sigma _{i=1}^{n}\;f(y_{i})(S_{i-1}+S_{i})}{S_{n}}}}$
where
${\displaystyle S_{i}=\Sigma _{j=1}^{i}\;f(y_{j})\,y_{j}\,}$ and ${\displaystyle S_{0}=0\,}$. These formulae are also applicable in the limit as ${\displaystyle n\rightarrow \infty }$.
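
The two expressions for a discrete probability distribution, the double sum over $f(y_i)f(y_j)|y_i - y_j|$ and the cumulative form using $S_i$, should agree. The sketch below computes both for a small distribution; the function names and the `(value, probability)` pair representation are illustrative assumptions, and the probabilities are assumed to sum to 1.

```python
def gini_double_sum(points):
    """G = (1 / (2 mu)) * sum_i sum_j f_i f_j |y_i - y_j|."""
    mu = sum(y * f for y, f in points)
    spread = sum(fi * fj * abs(yi - yj) for yi, fi in points for yj, fj in points)
    return spread / (2 * mu)


def gini_cumulative(points):
    """G = 1 - sum_i f_i (S_{i-1} + S_i) / S_n, with points sorted by value."""
    ordered = sorted((y, f) for y, f in points if f > 0)
    s_prev = 0.0  # S_0 = 0
    acc = 0.0
    for y, f in ordered:
        s = s_prev + f * y  # S_i = S_{i-1} + f(y_i) y_i
        acc += f * (s_prev + s)
        s_prev = s
    return 1 - acc / s_prev  # s_prev is now S_n


# Hypothetical distribution: 50% earn 10, 30% earn 20, 20% earn 50.
dist = [(10, 0.5), (20, 0.3), (50, 0.2)]
print(round(gini_double_sum(dist), 6))
print(round(gini_cumulative(dist), 6))
```

The cumulative form needs only one pass over the sorted points, while the double sum grows quadratically with the number of distinct values.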
${\displaystyle B=\int _{0}^{1}L(F)dF.}$
${\displaystyle G=1-{\frac {1}{\mu }}\int _{0}^{\infty }(1-F(y))^{2}dy={\frac {1}{\mu }}\int _{0}^{\infty }F(y)(1-F(y))dy}$
${\displaystyle G={\frac {1}{2\mu }}\int _{0}^{1}\int _{0}^{1}|Q(F_{1})-Q(F_{2})|\,dF_{1}\,dF_{2}.}$
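
The integral form $G = \frac{1}{\mu}\int_0^\infty F(y)(1-F(y))\,dy$ can be checked numerically against a known closed form. The sketch below applies a plain trapezoidal rule to an exponential distribution, for which the Gini coefficient is 1/2. The function name `gini_from_cdf`, the truncation point `upper`, and the step count are illustrative assumptions; the infinite upper limit is approximated by a point where the tail is negligible.

```python
import math


def gini_from_cdf(cdf, mean, upper, steps=100_000):
    """Approximate (1 / mean) * integral_0^upper F(y) (1 - F(y)) dy by the trapezoidal rule."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        f = cdf(i * h)
        weight = 0.5 if i in (0, steps) else 1.0  # endpoints get half weight
        total += weight * f * (1 - f)
    return total * h / mean


lam = 2.0  # rate of the exponential distribution; its mean is 1 / lam
g = gini_from_cdf(lambda y: 1 - math.exp(-lam * y), mean=1 / lam, upper=40 / lam)
print(round(g, 6))  # 0.5
```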
${\displaystyle G_{1}=1-\sum _{k=1}^{n}(X_{k}-X_{k-1})(Y_{k}+Y_{k-1})}$
${\displaystyle k=A+\ N(0,s^{2}/y_{k})}$
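${\displaystyle G={\frac {N+1}{N-1}}-{\frac {2}{N(N-1)\mu }}\left(\Sigma _{i=1}^{N}\;P_{i}X_{i}\right)}$
where $\mu$ is the mean income of the population, $N$ is the population size, and $P_i$ is the income rank of person $i$ with income $X_i$, such that the richest person receives rank 1 and the poorest rank $N$.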
${\displaystyle {\text{Inequality}}=\Sigma _{j}\,p_{j}\,f(r_{j})\,,}$
Limitations of the Gini coefficient include:
• Different income distributions with the same Gini coefficient
• Extreme wealth inequality, yet low income Gini coefficient
• Small sample bias: sparsely populated regions are more likely to have a low Gini coefficient
• The Gini coefficient is unable to discern the effects of structural changes in populations
• Inability to value benefits and income from the informal economy affects Gini coefficient accuracy

• For a population uniform on the values $y_i$, $i = 1$ to $n$, indexed in non-decreasing order ($y_i \leq y_{i+1}$):
• $X_k$ is the cumulated proportion of the population variable, for $k = 0, \ldots, n$, with $X_0 = 0$, $X_n = 1$.
• $Y_k$ is the cumulated proportion of the income variable, for $k = 0, \ldots, n$, with $Y_0 = 0$, $Y_n = 1$.
• $Y_k$ should be indexed in non-decreasing order ($Y_k > Y_{k-1}$).
• Anonymity: it does not matter who the high and low earners are.
• Scale independence: the Gini coefficient does not consider the size of the economy, the way it is measured, or whether it is a rich or poor country on average.
• Population independence: it does not matter how large the population of the country is.
• Transfer principle: if income (less than the difference between their incomes) is transferred from a rich person to a poor person, the resulting distribution is more equal.
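
The four properties listed above can be illustrated on a small example. The sketch below assumes the population form of the coefficient (normalized by $n$, not $n-1$), since replication invariance holds for that form; the function name `gini` and all income values are illustrative.

```python
def gini(values):
    """Population Gini coefficient from the rank-weighted sum of sorted values."""
    y = sorted(values)
    n = len(y)
    ranked = sum(i * v for i, v in enumerate(y, start=1))
    return 2 * ranked / (n * sum(y)) - (n + 1) / n


base = [20, 30, 40, 50, 60]

shuffled = [50, 20, 60, 30, 40]      # anonymity: same incomes, different owners
scaled = [1000 * v for v in base]    # scale independence: change of units
replicated = base * 3                # population independence: three copies
transferred = [25, 30, 40, 50, 55]   # transfer principle: 5 moved from richest to poorest

for name, data in [("base", base), ("shuffled", shuffled), ("scaled", scaled),
                   ("replicated", replicated), ("transferred", transferred)]:
    print(name, round(gini(data), 3))
```

The first four lines print the same value, 0.2, while the transfer lowers the coefficient to 0.16.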
Wikipedia