
Data binning


Data binning, or bucketing, is a data pre-processing technique used to reduce the effects of minor observation errors. The original data values that fall within a given small interval, a bin, are replaced by a value representative of that interval, often the central value. It is a form of quantization.
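As a minimal sketch of this replacement step, assuming fixed-width bins and the bin centre as the representative value, the following Python function maps each reading to the centre of its interval. The function name, the bin width, and the sample readings are hypothetical and chosen only for illustration.

```python
import math

def bin_to_center(values, width, origin=0.0):
    """Replace each value by the centre of the fixed-width bin it falls in.

    Bins are [origin + k*width, origin + (k+1)*width) for integer k.
    """
    centers = []
    for v in values:
        index = math.floor((v - origin) / width)        # which bin v falls in
        centers.append(origin + (index + 0.5) * width)  # centre of that bin
    return centers

# Hypothetical noisy readings clustered around two levels.
readings = [4.1, 4.4, 4.9, 7.2, 7.6]
smoothed = bin_to_center(readings, width=1.0)
print(smoothed)
```

Values that straddle a bin edge still end up in different bins, so the bin width and origin have to be chosen with the expected size of the observation errors in mind.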

Statistical data binning is a way to group a number of more or less continuous values into a smaller number of "bins". For example, if you have data about a group of people, you might want to arrange their ages into a smaller number of age intervals. It can also be used in multivariate statistics, binning in several dimensions at once.
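The age example could be sketched as follows. The interval boundaries, labels, and sample ages are assumptions made for illustration, not a standard grouping.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical age groups; EDGES holds the lower bound of every group
# after the first one.
EDGES = [18, 35, 50, 65]
LABELS = ["0-17", "18-34", "35-49", "50-64", "65+"]

def age_group(age):
    """Return the label of the age interval that contains `age`."""
    return LABELS[bisect_right(EDGES, age)]

ages = [3, 17, 18, 29, 41, 50, 64, 65, 88]   # made-up sample
counts = Counter(age_group(a) for a in ages)
for label in LABELS:
    print(label, counts[label])
```

Nine distinct ages collapse into five groups. Binning in several dimensions at once works the same way, with one set of edges per variable and a tuple of bin indices as the group key.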

In the context of image processing, binning is the procedure of combining a cluster of adjacent pixels into a single pixel. For example, in 2x2 binning, a block of 4 pixels becomes a single larger pixel, reducing the total number of pixels by a factor of 4.
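A minimal sketch of 2x2 binning on an image stored as a list of rows is shown below. It assumes the combined pixel is the sum of the block; averaging the block instead is an equally simple variant. The function name and the sample image are hypothetical.

```python
def bin_pixels(image, factor=2):
    """Combine each factor x factor block of pixels into one pixel by summing.

    `image` is a list of equal-length rows of numbers.
    """
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be multiples of the bin factor")
    return [
        [
            sum(image[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]

# Made-up 4x4 image; 2x2 binning turns it into a 2x2 image.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
binned = bin_pixels(image)
print(binned)
```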

This aggregation reduces the amount of data, at the cost of some loss of information, and thereby simplifies the analysis. For instance, binning may reduce the impact of read noise on the processed image, at the cost of a lower resolution.

Histograms are an example of data binning used to reveal the underlying distribution of the data. They are typically one-dimensional and use equal-width intervals for ease of visualization.
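A one-dimensional histogram with equal-width bins can be sketched as follows. The range, the number of bins, and the sample data are assumptions for illustration; values outside the range are simply ignored in this sketch.

```python
def histogram(values, n_bins, low, high):
    """Count how many values fall into each of n_bins equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not low <= v <= high:
            continue                         # skip out-of-range values
        # A value equal to `high` is placed in the last bin.
        i = min(int((v - low) / width), n_bins - 1)
        counts[i] += 1
    return counts

data = [0.5, 1.5, 1.7, 2.2, 2.9, 3.0, 3.1, 4.0]   # made-up sample
counts = histogram(data, n_bins=4, low=0.0, high=4.0)

# Crude text rendering: one '#' per observation in the bin.
for i, n in enumerate(counts):
    print(f"[{i:.0f}, {i + 1:.0f}) " + "#" * n)
```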

Data binning may be used when a collection of data profiles is subjected to pattern recognition analysis and small instrumental shifts in the spectral dimension of MS or NMR experiments would otherwise be falsely interpreted as representing different components. A straightforward way to cope with this problem is to use binning techniques in which the spectrum is reduced in resolution to a sufficient degree to ensure that a given peak remains in its bin despite small spectral shifts between analyses. For example, in NMR the chemical shift axis may be discretized and coarsely binned, and in MS the measured mass values may be rounded to integer atomic mass unit values.
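The MS case might look like the sketch below, which rounds each peak's mass to the nearest integer and sums the intensities that land in the same bin. The peak lists are invented, and representing a spectrum as (mass, intensity) pairs is an assumption of this example rather than any instrument's data format.

```python
from collections import defaultdict

def bin_spectrum(peaks):
    """Round each peak mass to the nearest integer and sum intensities per bin.

    `peaks` is an iterable of (mass, intensity) pairs.
    Note: Python's round() sends exact .5 values to the nearest even integer.
    """
    binned = defaultdict(float)
    for mass, intensity in peaks:
        binned[round(mass)] += intensity
    return dict(binned)

# Two hypothetical runs of the same sample with small mass shifts.
run_a = [(100.02, 10.0), (150.98, 4.0), (151.03, 1.0)]
run_b = [(99.97, 9.0), (151.01, 5.0)]

binned_a = bin_spectrum(run_a)
binned_b = bin_spectrum(run_b)
print(binned_a)
print(binned_b)
```

After binning, both runs are described by the same two bins (100 and 151), so the small shifts no longer look like different components. A peak that sits near a bin boundary can still jump between bins, which is why the bin width has to be large relative to the expected shift.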


Wikipedia
