Negative binomial distribution

Different texts adopt slightly different definitions for the negative binomial distribution. They can be distinguished by whether the support starts at k = 0 or at k = r, whether p denotes the probability of a success or of a failure, and whether r counts successes or failures, so it is crucial to identify the specific parametrization used in any given text.
Probability mass function
[Plot of the probability mass function for several parameter settings; the orange line marks the mean, equal to 10 in each plot, and the green line marks the standard deviation.]
Notation: NB(r, p)
Parameters: r > 0 — number of failures until the experiment is stopped (integer, but the definition can also be extended to reals)
p ∈ (0,1) — success probability in each experiment (real)
Support: k ∈ { 0, 1, 2, 3, … } — number of successes
pmf: C(k + r − 1, k) p^k (1 − p)^r, involving a binomial coefficient
CDF: I_{1−p}(r, k + 1), the regularized incomplete beta function
Mean: pr/(1 − p)
Mode: ⌊p(r − 1)/(1 − p)⌋ if r > 1; 0 if r ≤ 1
Variance: pr/(1 − p)^2
Skewness: (1 + p)/√(pr)
Ex. kurtosis: 6/r + (1 − p)^2/(pr)
MGF: ((1 − p)/(1 − p e^t))^r for t < −ln p
CF: ((1 − p)/(1 − p e^{it}))^r
PGF: ((1 − p)/(1 − p z))^r for |z| < 1/p

In probability theory and statistics, the negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of failures (denoted r) occurs. For example, if we define rolling a 1 on a die as a failure and any other outcome as a success, and we roll the die repeatedly until the third time a 1 appears (r = 3 failures), then the number of successes (non-1s) that appeared follows a negative binomial distribution.
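The die example can be checked by simulation. The sketch below (function and variable names are ours, purely illustrative) rolls a fair die until the third 1 appears and records the number of non-1 rolls; the empirical average should land near the theoretical mean pr/(1 − p) = (5/6 · 3)/(1/6) = 15.

```python
import random

def successes_before_third_one(rng):
    """Roll a fair die until the third 1 ("failure"); count non-1s ("successes")."""
    successes = 0  # non-1 rolls seen so far
    failures = 0   # 1s seen so far
    while failures < 3:
        if rng.randint(1, 6) == 1:
            failures += 1
        else:
            successes += 1
    return successes

rng = random.Random(42)
trials = 100_000
mean_successes = sum(successes_before_third_one(rng) for _ in range(trials)) / trials

# Theoretical mean of the negative binomial: p*r/(1-p) with p = 5/6, r = 3.
p, r = 5 / 6, 3
print(mean_successes)       # empirical estimate, close to 15
print(p * r / (1 - p))      # theoretical value, 15
```

With 100,000 repetitions the empirical mean typically agrees with the theoretical value to within a few hundredths.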

The Pascal distribution (after Blaise Pascal) and Polya distribution (after George Pólya) are special cases of the negative binomial. There is a convention among engineers, climatologists, and others to reserve "negative binomial" in a strict sense or "Pascal" for the case of an integer-valued stopping-time parameter r, and to use "Polya" for the real-valued case.

For occurrences of "contagious" discrete events, like tornado outbreaks, the Polya distribution can give more accurate models than the Poisson distribution because it allows the mean and variance to differ, whereas the Poisson requires them to be equal. "Contagious" events have positively correlated occurrences, and the resulting positive covariance terms produce a larger variance than if the occurrences were independent.
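The overdispersion point can be made concrete with the standard moment formulas: in the parametrization used here, the mean is pr/(1 − p) and the variance pr/(1 − p)^2, so the variance always exceeds the mean by a factor 1/(1 − p). A minimal numeric sketch (the parameter values are illustrative, not from the text):

```python
# Overdispersion sketch: for a negative binomial, variance > mean,
# whereas for a Poisson the two are equal.
p, r = 0.4, 5                     # illustrative parameters

nb_mean = p * r / (1 - p)         # negative binomial mean
nb_var = p * r / (1 - p) ** 2     # negative binomial variance

print(nb_mean)                    # mean
print(nb_var)                     # variance, strictly larger
print(nb_var / nb_mean)           # overdispersion factor = 1/(1-p)
```

As p approaches 0 the factor 1/(1 − p) tends to 1 and the distribution behaves like a Poisson; larger p means heavier overdispersion.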

Suppose there is a sequence of independent Bernoulli trials. Each trial has two potential outcomes, called "success" and "failure". In each trial the probability of success is p and the probability of failure is (1 − p). We observe this sequence until a predefined number r of failures has occurred. Then the random number of successes we have seen, X, has the negative binomial (or Pascal) distribution:

P(X = k) = C(k + r − 1, k) p^k (1 − p)^r,  for k = 0, 1, 2, …

The binomial coefficient C(k + r − 1, k) counts the arrangements of the k successes among the first k + r − 1 trials; the final trial is always the r-th failure.

When applied to real-world problems, outcomes of success and failure may or may not be outcomes we ordinarily view as good and bad, respectively. Suppose we used the negative binomial distribution to model the number of days a certain machine works before it breaks down. In this case "success" would be the result on a day when the machine worked properly, whereas a breakdown would be a "failure". If we used the negative binomial distribution to model the number of goal attempts a sportsman makes before scoring r goals, though, then each unsuccessful attempt would be a "success", and scoring a goal would be "failure". If we are tossing a coin, then the negative binomial distribution can give the number of heads ("success") we are likely to encounter before we encounter a certain number of tails ("failure"). In the probability mass function below, p is the probability of success, and (1 − p) is the probability of failure.
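The probability mass function described here can be written down directly from a binomial coefficient and the success/failure probabilities. A sketch (the function name is ours, not from the text), sanity-checked on the die example with p = 5/6 and r = 3:

```python
from math import comb

def neg_binomial_pmf(k, r, p):
    """P(X = k): probability of seeing exactly k successes before the r-th
    failure, where each independent trial succeeds with probability p."""
    return comb(k + r - 1, k) * p**k * (1 - p) ** r

# Die example: "success" = rolling a non-1 (p = 5/6), stop at r = 3 ones.
p, r = 5 / 6, 3

# The pmf should sum to 1 over the support (truncated far into the tail).
total = sum(neg_binomial_pmf(k, r, p) for k in range(2000))
print(total)      # approximately 1.0

# The mean should match p*r/(1-p) = 15.
mean = sum(k * neg_binomial_pmf(k, r, p) for k in range(2000))
print(mean)       # approximately 15.0
```

Truncating the sum at k = 2000 is harmless here because the tail decays geometrically like p^k.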

