Hypergeometric Distribution Calculator
Table of contents
- What is hypergeometric distribution?
- Hypergeometric distribution formula
- How to use this hypergeometric distribution calculator?
- Example of the hypergeometric distribution
- Hypergeometric distribution vs. binomial distribution
Use our hypergeometric distribution calculator whenever you need to find the probability (or cumulative probability) of a random variable following the hypergeometric distribution. If you want to learn what the hypergeometric distribution is and what the hypergeometric distribution formula looks like, keep reading!
Besides those essential facts, we also provide you with an example of the hypergeometric distribution and discuss when to use the hypergeometric probability distribution vs. the (more familiar) binomial distribution.
What is hypergeometric distribution?
The hypergeometric probability distribution describes the number of successes (objects with a specified feature, as opposed to objects without this feature) in a sample of fixed size when we know the total number of items and the number of success items (total number of objects with that feature). Importantly, we assume sampling is without replacement — when we choose an item from the population, we cannot select it again.
The hypergeometric distribution is useful whenever an observed event cannot re-occur, e.g., in various card games, in which drawing a card means we will not draw that card again. It also appears in Fisher's exact test, which we use to test the difference between two proportions when the sample size is small (≤ 50). Check out our dedicated Fisher's exact test calculator to discover more.
Note that, although the population's items are divided into two mutually exclusive categories (success/failure), the hypergeometric distribution is not the same as the binomial distribution. See the last section and the binomial distribution calculator for more details.
But first, let's discuss the formula for the hypergeometric distribution.
Hypergeometric distribution formula
Three parameters define the hypergeometric probability distribution:
- N: Total number of items in the population;
- K: Number of success items in the population; and
- n: Number of drawn items (sample size).
A random variable X follows the hypergeometric distribution if its probability mass function is given by:
$$P(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$$
where:
- k: Number of drawn success items.
The hypergeometric distribution formula is usually written with binomial coefficients. Using the factorial operator !, we can rewrite the above equation as:
$$P(X = k) = \frac{K!\,(N-K)!\,n!\,(N-n)!}{N!\,k!\,(K-k)!\,(n-k)!\,(N-K-n+k)!}$$
🔎 See the factorial calculator if you're not sure what the exclamation mark ! means.
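The binomial-coefficient form of the formula translates almost directly into code. The following is a minimal Python sketch, not the calculator's actual implementation: the function name `hypergeom_pmf` is an illustrative choice, and it assumes Python 3.8 or later for `math.comb`.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """Return P(X = k) for population size N, K success items, sample size n."""
    # k must lie in the support: we cannot draw more successes than exist (K)
    # or than we sample (n), nor fewer than n - (N - K).
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    # Ways to pick k successes and n - k failures, over all ways to pick n items.
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


# Tiny illustrative case: 5 items, 2 of them successes, draw 2.
print(hypergeom_pmf(1, N=5, K=2, n=2))  # 0.6
```

Working with `comb` on exact integers and dividing only once at the end avoids the overflow and rounding problems that evaluating the large factorials separately as floats would cause.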
The mean and variance of the hypergeometric distribution
For a hypergeometric distribution with parameters N, K, n:
- The mean of the hypergeometric distribution (expected value) is equal to:
  n × K / N
- The variance of the hypergeometric distribution is equal to:
  n × K × (N - K) × (N - n) / [N² × (N - 1)]
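One way to convince yourself of these two closed-form expressions is to compare them with the mean and variance obtained by summing over the probability mass function directly. The sketch below does that in Python; the function names and the parameter values (N = 20, K = 7, n = 5) are arbitrary illustrative assumptions, and the variance formula assumes N > 1.

```python
from math import comb


def hypergeom_mean(N: int, K: int, n: int) -> float:
    """Closed-form mean: n * K / N."""
    return n * K / N


def hypergeom_variance(N: int, K: int, n: int) -> float:
    """Closed-form variance; requires N > 1 to avoid a zero denominator."""
    return n * K * (N - K) * (N - n) / (N**2 * (N - 1))


def moments_from_pmf(N: int, K: int, n: int) -> tuple:
    """Brute-force mean and variance by summing over the support."""
    lo, hi = max(0, n - (N - K)), min(n, K)
    pmf = {k: comb(K, k) * comb(N - K, n - k) / comb(N, n)
           for k in range(lo, hi + 1)}
    mean = sum(k * p for k, p in pmf.items())
    var = sum((k - mean) ** 2 * p for k, p in pmf.items())
    return mean, var


N, K, n = 20, 7, 5  # arbitrary illustrative parameters
print(hypergeom_mean(N, K, n), hypergeom_variance(N, K, n))
print(moments_from_pmf(N, K, n))  # should agree up to floating-point error
```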
How to use this hypergeometric distribution calculator?
As you can see, there are lots of formulae related to the hypergeometric distribution that are not so trivial to evaluate. Fortunately, there's our hypergeometric distribution calculator! 😁 Let's explain how to use it before we move on to an example of the hypergeometric distribution.
- Enter the parameters of the hypergeometric distribution you want to consider.
- Choose what to compute: P(X = k) or one of the four types of cumulative probabilities: P(X > k), P(X ≥ k), P(X < k), P(X ≤ k).
- Our hypergeometric distribution calculator returns the desired probability.
- At the very bottom of the calculator, you will find the mean and variance of your hypergeometric distribution.
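If you would like to reproduce these five quantities yourself, the cumulative probabilities are just sums of point probabilities. The following Python sketch shows one way to do it; it is not the calculator's own code, and the function names and dictionary keys are hypothetical choices made for this illustration.

```python
from math import comb


def pmf(i: int, N: int, K: int, n: int) -> float:
    """P(X = i); zero outside the support."""
    if i < max(0, n - (N - K)) or i > min(n, K):
        return 0.0
    return comb(K, i) * comb(N - K, n - i) / comb(N, n)


def hypergeom_probabilities(k: int, N: int, K: int, n: int) -> dict:
    """Return the point probability and the four cumulative probabilities."""
    below = sum(pmf(i, N, K, n) for i in range(0, k))          # P(X < k)
    equal = pmf(k, N, K, n)                                    # P(X = k)
    above = sum(pmf(i, N, K, n) for i in range(k + 1, n + 1))  # P(X > k)
    return {
        "P(X = k)": equal,
        "P(X < k)": below,
        "P(X <= k)": below + equal,
        "P(X > k)": above,
        "P(X >= k)": above + equal,
    }


# Illustrative parameters: 5 items, 2 successes, sample of 2, k = 1.
for name, value in hypergeom_probabilities(1, N=5, K=2, n=2).items():
    print(f"{name}: {value:.4f}")
```

Summing the upper tail directly, rather than computing it as 1 minus the lower tail, keeps very small tail probabilities from being lost to floating-point cancellation.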
Example of the hypergeometric distribution
Now that you know what the hypergeometric distribution is, let's have a look at an example.
Imagine a bag of chocolate bars with 12 dark and 36 white chocolate bars. You close your eyes and draw 10 bars without replacement.
- What is the probability that you have exactly 4 dark chocolate bars?
  The parameters are: N = 48, K = 12, n = 10, k = 4. So we apply the hypergeometric distribution formula and obtain:
  P(X = 4) = 12!×36!×10!×38! / (48!×4!×8!×6!×30!) ≈ 0.1474
- What is the probability that you have at least 4 dark chocolate bars?
  P(X ≥ 4) = P(X = 4) + P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) ≈ 0.2023
- What is the mean of this hypergeometric distribution?
  10 × 12 / 48 = 2.5
- What is the variance of this hypergeometric distribution?
  10 × 12 × (48 - 12) × (48 - 10) / [48² × 47] = 285 / 188 ≈ 1.5160
- What is the standard deviation of this hypergeometric distribution?
  √(285 / 188) ≈ 1.2312
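The numbers in this chocolate-bar example can be checked with a few lines of Python. This is only a verification sketch under the example's parameters (N = 48, K = 12, n = 10, k = 4); the variable and function names are illustrative.

```python
from math import comb, sqrt

N, K, n, k = 48, 12, 10, 4  # population, dark bars, bars drawn, dark bars wanted


def pmf(i: int) -> float:
    """P(X = i) for the chocolate-bar example (0 <= i <= n is always in the support here)."""
    return comb(K, i) * comb(N - K, n - i) / comb(N, n)


p_exactly_4 = pmf(k)
p_at_least_4 = sum(pmf(i) for i in range(k, n + 1))
mean = n * K / N
variance = n * K * (N - K) * (N - n) / (N**2 * (N - 1))
# Square root of the unrounded variance; rounding the variance first
# can shift the last printed digit.
std_dev = sqrt(variance)

print(f"P(X = 4)  = {p_exactly_4:.4f}")   # 0.1474
print(f"P(X >= 4) = {p_at_least_4:.4f}")  # 0.2023
print(f"mean      = {mean}")              # 2.5
print(f"variance  = {variance:.4f}")      # 1.5160
print(f"std dev   = {std_dev:.4f}")
```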
Hypergeometric distribution vs. binomial distribution
The hypergeometric and binomial distributions both quantify the probability of k successes in n trials. However:
- For the hypergeometric distribution, there is sampling with no replacement, so each draw decreases the population. In consequence, after each trial, the probability of success in the next trial changes; and
- For the binomial distribution, there is sampling with replacement, so the probability of success remains the same for every trial.
Tip: If the population size is large and the sample is small (relative to the population size), the hypergeometric distribution gives almost the same results as the binomial distribution.
So we can summarize as follows:
- Use the hypergeometric distribution if you sample without replacement and the population has few enough elements that a trial changes the probability of the next trial significantly; and
- Use the binomial distribution for sampling with replacement or for sampling without replacement with a large population and a small sample size.
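To see how quickly the two distributions converge, you can compare their probability mass functions for a small and a large population with the same success proportion. The Python sketch below does this; the helper names and the scaled-up population (N = 48,000, K = 12,000) are assumptions chosen purely for illustration.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) when sampling n items without replacement."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def max_abs_difference(N: int, K: int, n: int) -> float:
    """Largest gap between the two PMFs over k = 0..n, using p = K / N."""
    p = K / N
    return max(abs(hypergeom_pmf(k, N, K, n) - binom_pmf(k, n, p))
               for k in range(n + 1))


# Same success proportion (25%) and sample size, very different population sizes.
small_population = max_abs_difference(N=48, K=12, n=10)
large_population = max_abs_difference(N=48_000, K=12_000, n=10)
print(f"N = 48:     max difference = {small_population:.6f}")
print(f"N = 48000:  max difference = {large_population:.6f}")
```

With the small population the sample removes a noticeable share of the items, so the gap is visible; with the large one, each draw barely changes the success probability and the binomial approximation is very close.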