Binomial distribution

From Wikipedia, the free encyclopedia

Binomial
[Figure: Probability mass function for the binomial distribution. The lines connecting the dots are added for clarity.]
[Figure: Cumulative distribution function for the binomial distribution. Colors match the figure above.]
Parameters n \geq 0 number of trials (integer)
0\leq p \leq 1 success probability (real)
Support k \in \{0,\dots,n\}\!
Probability mass function (pmf) {n\choose k} p^k (1-p)^{n-k} \!
Cumulative distribution function (cdf) I_{1-p}(n-\lfloor k\rfloor, 1+\lfloor k\rfloor) \!
Mean np\!
Median \lfloor np\rfloor or \lceil np\rceil
Mode \lfloor (n+1)\,p\rfloor\!
Variance np(1-p)\!
Skewness \frac{1-2p}{\sqrt{np(1-p)}}\!
Excess kurtosis \frac{1-6p(1-p)}{np(1-p)}\!
Entropy \frac{1}{2} \ln \left( 2 \pi n e p (1-p) \right) + O \left( \frac{1}{n} \right)
Moment-generating function (mgf) (1-p + pe^t)^n \!
Characteristic function (1-p + pe^{it})^n \!

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. In particular, when n = 1, the binomial distribution is the Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.


An elementary example is this: roll a die ten times and count the number of 1s rolled. This count is a random variable that follows a binomial distribution with n = 10 and p = 1/6.

As another example, assume 5% of the population is green-eyed. You pick 500 people randomly. The number of green-eyed people you pick is a random variable X which follows a binomial distribution with n = 500 and p = 0.05 (provided the people are picked with replacement).

In general, if the random variable K follows the binomial distribution with parameters n and p, we write K ~ B(n, p). The probability of getting exactly k successes is given by the probability mass function:

f(k;n,p)={n\choose k}p^k(1-p)^{n-k}

for k = 0, 1, 2, ..., n and where

{n\choose k}=\frac{n!}{k!(n-k)!}

is the binomial coefficient (hence the name of the distribution) "n choose k" (also denoted C(n, k) or nCk). The formula can be understood as follows: we want k successes, which occur with probability p^k, and n − k failures, which occur with probability (1 − p)^{n−k}. However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.
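The probability mass function can be evaluated directly from this formula. The following is a minimal Python sketch; the function name binom_pmf is chosen here for illustration and is not part of any library. It uses the die example from above (n = 10, p = 1/6).

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if not 0 <= k <= n:
        return 0.0  # outside the support {0, ..., n}
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Die example from the text: ten rolls, counting the number of 1s.
n, p = 10, 1 / 6
probabilities = [binom_pmf(k, n, p) for k in range(n + 1)]

print(probabilities[0])    # probability of rolling no 1s at all, (5/6)^10
print(sum(probabilities))  # the probabilities sum to 1 up to rounding error
```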

Reference tables of binomial probabilities are usually filled in only up to n/2 values. This is because for k > n/2, the probability can be obtained by exchanging the roles of success and failure:

f(k;n,p)=f(n-k;n,1-p).\,\!

So, one must look to a different k and a different p (the binomial is not symmetrical in general).
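The identity can be checked numerically. The sketch below uses illustrative values n = 12 and p = 0.3 (not taken from the text) and a helper binom_pmf defined only for this example.

```python
from math import comb, isclose

def binom_pmf(k, n, p):
    """f(k; n, p) for 0 <= k <= n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 12, 0.3  # illustrative values

# Compare f(k; n, p) with f(n - k; n, 1 - p) for every k in the support.
symmetric = all(
    isclose(binom_pmf(k, n, p), binom_pmf(n - k, n, 1 - p))
    for k in range(n + 1)
)
print(symmetric)
```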

The cumulative distribution function can be expressed in terms of the regularized incomplete beta function, as follows:

F(k;n,p) = \Pr(X \le k) = I_{1-p}(n-k, k+1) \!

provided k is an integer and 0 ≤ k ≤ n. If x is not necessarily an integer or not necessarily positive, one can express it thus:

F(x;n,p) = \Pr(X \le x) = \sum_{j=0}^{\lfloor x\rfloor} {n\choose j}p^j(1-p)^{n-j}
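The summation form translates directly into code. This is a sketch only: the name binom_cdf is hypothetical, and the direct sum is adequate for moderate n but is not how numerical libraries typically evaluate the function.

```python
from math import comb, floor

def binom_cdf(x: float, n: int, p: float) -> float:
    """Pr(X <= x) for X ~ B(n, p); x may be any real number."""
    upper = floor(x)
    if upper < 0:
        return 0.0          # empty sum: no outcomes are <= x
    upper = min(upper, n)   # outcomes above n have probability 0
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(upper + 1))

# A non-integer argument gives the same value as its floor.
print(binom_cdf(2.7, 5, 0.5), binom_cdf(2, 5, 0.5))
```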

For k ≤ np, upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound

F(k;n,p) \leq \exp\left(-2 \frac{(np-k)^2}{n}\right), \!

and Chernoff's inequality can be used to derive the bound

F(k;n,p) \leq \exp\left(-\frac{1}{2\,p} \frac{(np-k)^2}{n}\right). \!
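To get a feel for how loose or tight these bounds are, they can be compared with the exact lower tail. The sketch below assumes illustrative values n = 100 and p = 0.5; all function names are hypothetical.

```python
from math import comb, exp

def binom_cdf(k, n, p):
    """Exact Pr(X <= k) for integer 0 <= k <= n."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def hoeffding_bound(k, n, p):
    """Upper bound on F(k; n, p) from Hoeffding's inequality (k <= np)."""
    return exp(-2 * (n * p - k) ** 2 / n)

def chernoff_bound(k, n, p):
    """Upper bound on F(k; n, p) from Chernoff's inequality (k <= np)."""
    return exp(-((n * p - k) ** 2) / (2 * p * n))

n, p = 100, 0.5  # illustrative values
for k in (30, 40, 50):
    print(k, binom_cdf(k, n, p), hoeffding_bound(k, n, p), chernoff_bound(k, n, p))
```

Comparing the exponents 2 and 1/(2p) shows that the Chernoff form is the sharper of the two when p < 1/4 and the Hoeffding form is sharper when p > 1/4.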

If X ~ B(n, p) (that is, X is a binomially distributed random variable), then the expected value of X is

\operatorname{E}(X)=np\,\!

and the variance is

\operatorname{Var}(X)=np(1-p).\,\!

This fact is easily proven as follows. Suppose first that we have exactly one Bernoulli trial. We have two possible outcomes, 1 and 0, with the first having probability p and the second having probability 1 − p; the mean for this trial is given by μ = p. Using the definition of variance, we have

\sigma^2= \left(1 - p\right)^2p + (0-p)^2(1 - p) = p(1-p).

Now suppose that we want the variance for n such trials (i.e. for the general binomial distribution). Since the trials are independent, we may add the variances for each trial, giving

\sigma^2_n = \sum_{k=1}^n \sigma^2 = np(1 - p). \quad
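The two formulas can be confirmed numerically by computing the mean and variance straight from the probability mass function. The values n = 20 and p = 0.3 below are illustrative only.

```python
from math import comb

def binom_pmf(k, n, p):
    """f(k; n, p) for 0 <= k <= n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 20, 0.3  # illustrative values
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance from their definitions as weighted sums.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)                # both close to 6
print(variance, n * p * (1 - p))  # both close to 4.2
```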

The most likely value or mode of X is given by the largest integer less than or equal to (n + 1)p; if m = (n + 1)p is itself an integer, then m − 1 and m are both modes.
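As a quick check of this rule, the sketch below compares the formula with a brute-force search over the probability mass function, and then looks at a case where (n + 1)p is an integer. The parameter values are illustrative, and the helper names are hypothetical. Because it relies on floating-point arithmetic, the floor may be off by one when (n + 1)p is an integer that p cannot represent exactly.

```python
from math import comb, floor

def binom_pmf(k, n, p):
    """f(k; n, p) for 0 <= k <= n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_mode(n, p):
    """Largest integer less than or equal to (n + 1) p."""
    return floor((n + 1) * p)

# Generic case: (n + 1) p = 3.3 is not an integer, so the mode is unique.
n, p = 10, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
brute_force_mode = max(range(n + 1), key=lambda k: pmf[k])
print(binom_mode(n, p), brute_force_mode)

# Tie case: (n + 1) p = 5 is an integer, so m - 1 = 4 and m = 5 are both modes.
n, p = 9, 0.5
m = binom_mode(n, p)
print(binom_pmf(m - 1, n, p), binom_pmf(m, n, p))
```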

If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables, then X + Y is again a binomial variable; its distribution is

X+Y \sim B(n+m, p).\,
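Since X and Y are independent, the distribution of X + Y is the convolution of their probability mass functions, so the statement can be verified numerically. The values n = 4, m = 6 and p = 0.35 are illustrative, and convolve is a small helper written for this sketch.

```python
from math import comb, isclose

def binom_pmf(k, n, p):
    """f(k; n, p) for 0 <= k <= n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def convolve(a, b):
    """Distribution of the sum of two independent variables with pmfs a and b."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

n, m, p = 4, 6, 0.35  # illustrative values
pmf_of_sum = convolve(
    [binom_pmf(k, n, p) for k in range(n + 1)],
    [binom_pmf(k, m, p) for k in range(m + 1)],
)
direct = [binom_pmf(k, n + m, p) for k in range(n + m + 1)]

matches = all(isclose(a, b, abs_tol=1e-12) for a, b in zip(pmf_of_sum, direct))
print(matches)
```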

Binomial PDF and normal approximation for n = 6 and p = 0.5.

If n is large enough, the skew of the distribution is not too great, and a suitable continuity correction is used, then an excellent approximation to B(n, p) is given by the normal distribution

\operatorname{N}(np, np(1-p)).\,\!

Various rules of thumb may be used to decide whether n is large enough. One rule is that both np and n(1 − p) must be greater than 5. However, the specific number varies from source to source, and depends on how good an approximation one wants; some sources give 10. Another commonly used rule holds that the above normal approximation is appropriate only if

\mu \pm 3 \sigma = np \pm 3 \sqrt{np(1-p)} \in [0,n].

The following is an example of applying a continuity correction: Suppose one wishes to calculate Pr(X ≤ 8) for a binomial random variable X. If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is approximated by Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction. Note that the normal approximation can be considerably less accurate when no continuity correction is applied, particularly for small n.
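The effect of the continuity correction on Pr(X ≤ 8) can be seen in a small computation. The text does not specify n and p for this example, so the sketch below assumes n = 20 and p = 0.5; the helper names are hypothetical.

```python
from math import comb, erf, sqrt

def binom_cdf(k, n, p):
    """Exact Pr(X <= k) for integer 0 <= k <= n."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of N(mean, sd^2)."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

n, p, k = 20, 0.5, 8  # n and p are assumed values, k = 8 is from the text
mean, sd = n * p, sqrt(n * p * (1 - p))

exact = binom_cdf(k, n, p)
with_correction = normal_cdf(k + 0.5, mean, sd)  # Pr(Y <= 8.5)
without_correction = normal_cdf(k, mean, sd)     # Pr(Y <= 8)

print(exact, with_correction, without_correction)
```

For these assumed parameters the corrected value lies much closer to the exact probability than the uncorrected one.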

This approximation is a huge time-saver (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, introduced by Abraham de Moivre in 1733 and included in the second edition of his book The Doctrine of Chances in 1738. Nowadays, it can be seen as a consequence of the central limit theorem since B(n, p) is a sum of n independent, identically distributed 0-1 indicator variables.

For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation σ = (p(1 − p)/n)^{1/2}. Large sample sizes n are good because the standard deviation gets smaller, which allows a more precise estimate of the unknown parameter p.
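The dependence of the standard deviation on the sample size can be made concrete with a few values. In this sketch the true proportion p = 0.5 and the sample sizes are assumptions chosen for illustration.

```python
from math import sqrt

def proportion_sd(p: float, n: int) -> float:
    """Standard deviation of the sample proportion for sample size n."""
    return sqrt(p * (1 - p) / n)

p = 0.5  # assumed true proportion
for n in (100, 400, 1600):
    print(n, proportion_sd(p, n))
```

Because n appears under a square root, quadrupling the sample size only halves the standard deviation.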

The binomial distribution converges towards the Poisson distribution as the number of trials goes to infinity while the product np remains fixed. Therefore the Poisson distribution with parameter λ = np can be used as an approximation to the binomial distribution B(n, p) if n is sufficiently large and p is sufficiently small. According to two rules of thumb, this approximation is good if n ≥ 20 and p ≤ 0.05, or if n ≥ 100 and np ≤ 10.[1]
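The quality of the Poisson approximation can be inspected by comparing the two probability mass functions term by term. The sketch below assumes n = 100 and p = 0.03, which satisfy both rules of thumb; the function names are hypothetical.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """f(k; n, p) for 0 <= k <= n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of the value k under a Poisson distribution with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 100, 0.03  # assumed values: large n, small p
lam = n * p

# Largest pointwise difference over k = 0, ..., 20, where almost all mass lies.
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
print(max_gap)
```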

  • As n approaches ∞ and p approaches 0 while np remains fixed at λ > 0, or at least np approaches λ > 0, the Binomial(n, p) distribution approaches the Poisson distribution with expected value λ.
  • As n approaches ∞ while p remains fixed, the distribution of
{X-np \over \sqrt{np(1-p)\ }}
approaches the normal distribution with expected value 0 and variance 1.

  1. ^ NIST/SEMATECH, '6.3.3.1. Counts Control Charts', e-Handbook of Statistical Methods, <http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm> [accessed 25 October 2006]
  • Abdi, H. (2007). Binomial Distribution: Binomial and Sign Tests. In N.J. Salkind (Ed.), Encyclopedia of Measurement and Statistics. Thousand Oaks (CA): Sage.
