Ground truth is a term used in various fields to refer to information provided by direct observation as opposed to information provided by inference.
In machine learning, the term "ground truth" refers to the accuracy of the training set's classification for supervised learning techniques. This is used in statistical models to prove or disprove research hypotheses. The term "ground truthing" refers to the process of gathering the proper objective (provable) data for this test. Compare with gold standard.
Bayesian spam filtering is a common example of supervised learning. In this system, the algorithm is manually taught the differences between spam and non-spam. This depends on the ground truth of the messages used to train the algorithm – inaccuracies in the ground truth will correlate to inaccuracies in the resulting spam/non-spam verdicts.
In remote sensing, "ground truth" refers to information collected on location. Ground truth allows image data to be related to real features and materials on the ground. The collection of ground-truth data enables calibration of remote-sensing data, and aids in the interpretation and analysis of what is being sensed. Examples include cartography, meteorology, analysis of aerial photographs, satellite imagery and other techniques in which data are gathered at a distance.
More specifically, ground truth may refer to a process in which a pixel on a satellite image is compared to what is there in reality (at the present time) in order to verify the contents of the pixel on the image. In the case of a classified image, it allows supervised classification to help determine the accuracy of the classification performed by the remote sensing software and therefore minimize errors in the classification such as errors of commission and errors of omission.