Empirical probability

The empirical probability, relative frequency, or experimental probability of an event is the ratio of the number of trials in which the event occurs to the total number of trials, counted in an actual experiment rather than in a theoretical sample space. In a more general sense, empirical probability estimates probabilities from experience and observation.

Given an event A in a sample space, the relative frequency of A is the ratio m/n, where m is the number of trials in which the event A occurs and n is the total number of trials in the experiment.
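The ratio m/n can be computed directly from a list of observed outcomes. The following is a minimal sketch, not a reference implementation: the helper name `relative_frequency`, the simulated die rolls, and the choice of event ("the roll is even") are all assumptions made for illustration.

```python
import random


def relative_frequency(outcomes, event):
    """Return m/n, the fraction of observed outcomes for which `event` holds.

    `outcomes` is a sequence of trial results and `event` is a predicate
    that returns True when the event of interest occurred on a trial.
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("at least one trial is required")
    m = sum(1 for x in outcomes if event(x))
    return m / n


# Hypothetical experiment: 600 simulated rolls of a fair six-sided die.
rng = random.Random(0)
rolls = [rng.randint(1, 6) for _ in range(600)]

# Empirical probability of the event "the roll is even".
estimate = relative_frequency(rolls, lambda x: x % 2 == 0)
print(estimate)
```

For a fair die the printed value should lie near the theoretical probability 1/2, but it will generally not equal it, since it is computed from a finite number of trials.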

In statistical terms, the empirical probability is an estimate or estimator of a probability. In simple cases, where the result of a trial only determines whether or not the specified event has occurred, modelling using a binomial distribution might be appropriate, and the empirical estimate is then the maximum likelihood estimate. The empirical estimate is also the Bayesian estimate for the same case if certain assumptions are made for the prior distribution of the probability. If a trial yields more information, the empirical probability can be improved on by adopting further assumptions in the form of a statistical model: if such a model is fitted, it can be used to derive an estimate of the probability of the specified event.
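The maximum likelihood claim for the binomial case can be checked in a few lines. The derivation below is a sketch that assumes n independent trials with a common success probability p and m observed occurrences of the event. The symbols \hat{p}, \ell, \alpha and \beta are notation introduced here, and the Beta prior is only one possible choice of prior.

```latex
% Binomial likelihood for m occurrences in n trials
L(p) = \binom{n}{m}\, p^{m} (1-p)^{\,n-m}, \qquad 0 < p < 1 .

% Log-likelihood and its derivative
\ell(p) = \log\binom{n}{m} + m \log p + (n-m)\log(1-p),
\qquad
\frac{d\ell}{dp} = \frac{m}{p} - \frac{n-m}{1-p} .

% Setting the derivative to zero gives the empirical probability
\frac{m}{\hat{p}} = \frac{n-m}{1-\hat{p}}
\;\Longrightarrow\;
\hat{p} = \frac{m}{n} .

% Bayesian counterpart under an assumed Beta(\alpha, \beta) prior:
% the posterior is Beta(m+\alpha,\; n-m+\beta), whose mode is
\frac{m+\alpha-1}{\,n+\alpha+\beta-2\,},
\quad\text{which equals } \frac{m}{n} \text{ for the uniform prior } \alpha=\beta=1 .
```

Under this assumed uniform prior, the posterior mode coincides with the empirical probability. Other priors, or a different Bayesian summary such as the posterior mean, generally give a different value.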

An advantage of estimating probabilities using empirical probabilities is that this procedure is relatively free of assumptions.

For example, consider estimating the probability among a population of men that they satisfy two conditions: (1) that they are over 6 feet in height, and (2) that they prefer strawberry jam to raspberry jam.

A direct estimate could be found by counting the number of men who satisfy both conditions to give the empirical probability of the combined condition. An alternative estimate could be found by multiplying the proportion of men who are over 6 feet in height with the proportion of men who prefer strawberry jam to raspberry jam, but this estimate relies on the assumption that the two conditions are statistically independent.
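The two estimates can be compared on a small sample. The sketch below uses made-up data, invented purely for illustration: eight hypothetical men, each recorded as a pair (over 6 feet, prefers strawberry jam). The variable names are likewise assumptions.

```python
# Hypothetical sample, invented for illustration only.
# Each record is (is_over_6_feet, prefers_strawberry_jam).
men = [
    (True, True), (True, False), (False, True), (False, False),
    (True, True), (False, False), (False, True), (False, False),
]
n = len(men)

# Direct estimate: relative frequency of men satisfying both conditions.
direct = sum(1 for tall, strawberry in men if tall and strawberry) / n

# Alternative estimate: product of the two marginal relative frequencies.
# This is only justified if the two conditions are statistically independent.
p_tall = sum(1 for tall, _ in men if tall) / n
p_strawberry = sum(1 for _, strawberry in men if strawberry) / n
product = p_tall * p_strawberry

print(direct)   # 0.25   (2 of 8 men satisfy both conditions)
print(product)  # 0.1875 (3/8 * 4/8)
```

In this invented sample the two estimates differ. That alone does not show which is closer to the true probability: with so few observations the gap may be sampling noise, or it may indicate that the independence assumption does not hold.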

A disadvantage of using empirical probabilities arises in estimating probabilities that are very close to zero or very close to one. In these cases very large sample sizes would be needed in order to estimate such probabilities to a good standard of relative accuracy. Here statistical models can help, depending on the context, and in general one can hope that such models would provide improvements in accuracy compared to empirical probabilities, provided that the assumptions involved actually do hold.
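The growth in required sample size can be made concrete under a binomial model, which is an assumption here. With n independent trials and true probability p, the empirical probability has standard error sqrt(p(1-p)/n), so its standard error relative to p is sqrt((1-p)/(np)). Requiring this relative standard error to be at most r gives n >= (1-p)/(p r^2). The function name `trials_for_relative_error` and the example values of p and r are illustrative choices only.

```python
import math


def trials_for_relative_error(p, rel_se):
    """Smallest n such that sqrt((1 - p) / (n * p)) <= rel_se.

    Assumes n independent trials with a common success probability p
    (a binomial model); `rel_se` is the target relative standard error.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if rel_se <= 0.0:
        raise ValueError("rel_se must be positive")
    return math.ceil((1.0 - p) / (p * rel_se ** 2))


# Trials needed for a 10% relative standard error at several values of p.
for p in (0.5, 0.01, 0.0001):
    print(p, trials_for_relative_error(p, 0.10))
```

The required number of trials grows roughly in proportion to 1/p as p approaches zero. For a probability close to one, the same reasoning applies to the complementary event, whose probability 1 - p is then close to zero.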

...
Wikipedia