Sum of squares (statistics)

The partition of sums of squares is a concept that permeates much of inferential and descriptive statistics. More properly, it is the partitioning of sums of squared deviations or errors. Mathematically, the sum of squared deviations is an unscaled, or unadjusted, measure of dispersion (also called variability). When divided by the number of degrees of freedom, it estimates the variance, a measure of the spread of the observations about their mean value. Partitioning the sum of squared deviations into components allows the overall variability in a dataset to be ascribed to different types or sources of variability, with the relative importance of each source quantified by the size of its component of the overall sum of squares.
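As a rough illustration of such a partition, the sketch below splits the total sum of squared deviations of a small grouped dataset into a between-group and a within-group component. The data, the group labels and the helper `sum_sq_dev` are made up for this example, and a one-way grouping is only one of many ways the total can be partitioned.

```python
from statistics import mean


def sum_sq_dev(values, center):
    """Sum of squared deviations of `values` from `center` (hypothetical helper)."""
    return sum((v - center) ** 2 for v in values)


# Made-up observations in two illustrative groups.
groups = {"a": [1.0, 2.0, 3.0], "b": [4.0, 6.0, 8.0]}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)

# Total variability: deviations of every observation from the grand mean.
ss_total = sum_sq_dev(all_values, grand_mean)

# Within-group variability: deviations of observations from their own group mean.
ss_within = sum(sum_sq_dev(g, mean(g)) for g in groups.values())

# Between-group variability: deviations of group means from the grand mean,
# weighted by group size.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

print(ss_total, ss_within, ss_between)  # 34.0 10.0 24.0
```

Here the two components add up to the total (10 + 24 = 34), and the between-group component accounts for the larger share of the overall variability in this toy dataset.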

The deviation of a point in a collection of data is its signed difference from the mean of the data. It can be written as $y_{i}-{\overline {y}}$, where $y_{i}$ is the $i$-th data point and ${\overline {y}}$ is the sample mean, which serves as the estimate of the mean. If all $n$ such deviations are squared and then summed, as in $\sum _{i=1}^{n}\left(y_{i}-{\overline {y}}\,\right)^{2}$, the result is the "sum of squares" for these data.
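The definition can be sketched in a few lines of Python. The function name `sum_of_squares` and the data values are assumptions chosen for illustration; the final step divides by $n-1$, the degrees of freedom usually used when the mean is estimated from the same data, to obtain the sample variance.

```python
from statistics import mean


def sum_of_squares(values):
    """Return the sum of squared deviations of `values` from their mean."""
    y_bar = mean(values)
    return sum((y - y_bar) ** 2 for y in values)


# Made-up data with mean 5.0; the deviations are -3, -1, -1, 1, 4.
data = [2.0, 4.0, 4.0, 6.0, 9.0]

ss = sum_of_squares(data)          # 9 + 1 + 1 + 1 + 16
sample_var = ss / (len(data) - 1)  # scale by n - 1 degrees of freedom

print(ss)          # 28.0
print(sample_var)  # 7.0
```

The value of `sample_var` agrees with what `statistics.variance(data)` returns for the same list.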

...
Wikipedia