scipy 1.10.1 Pypi GitHub Homepage

Notes

Consider a box containing M balls: n red and M-n blue. We randomly sample balls from the box, one at a time and without replacement, until we have picked r blue balls. `nhypergeom` is the distribution of the number of red balls k we have picked.
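The sampling scheme described above can be imitated with a small simulation. The following is only a sketch using the standard library; the helper names `draw_until_r_blue` and `empirical_pmf`, the trial count, and the seed are illustrative assumptions and are not part of SciPy.

```python
import random
from collections import Counter


def draw_until_r_blue(M, n, r, rng):
    """Return the number of red balls drawn before the r-th blue ball appears."""
    if r == 0:
        return 0  # no blue ball is required, so nothing is drawn
    box = ["red"] * n + ["blue"] * (M - n)
    # A uniformly shuffled order models sequential draws without replacement.
    rng.shuffle(box)
    red = blue = 0
    for ball in box:
        if ball == "blue":
            blue += 1
            if blue == r:
                break  # stop as soon as the r-th blue ball is picked
        else:
            red += 1
    return red


def empirical_pmf(M, n, r, trials=20_000, seed=0):
    """Estimate P(k red balls) for k = 0..n by repeated simulation."""
    rng = random.Random(seed)
    counts = Counter(draw_until_r_blue(M, n, r, rng) for _ in range(trials))
    return [counts[k] / trials for k in range(n + 1)]


# Hypothetical parameters: 20 balls, 7 red, stop after 12 blue.
freqs = empirical_pmf(M=20, n=7, r=12)
for k, p in enumerate(freqs):
    print(k, round(p, 3))
```

With enough trials, the printed frequencies should approach the values returned by `nhypergeom.pmf` for the same parameters.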

The symbols used to denote the shape parameters (M, n, and r) are not universally accepted. See the Examples for a clarification of the definitions used here.

The probability mass function is defined as

f(k; M, n, r) = \frac{\binom{k+r-1}{k}\binom{M-r-k}{n-k}}{\binom{M}{n}}

for k \in [0, n], n \in [0, M], r \in [0, M-n], and the binomial coefficient is:

\binom{n}{k} \equiv \frac{n!}{k! (n - k)!}.
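As a rough illustration, the formula can be evaluated directly with exact integer binomial coefficients. This sketch is not SciPy's implementation; the function name `nhg_pmf` is hypothetical, and it assumes r >= 1 because `math.comb` rejects the negative argument that r = 0 would produce.

```python
from math import comb


def nhg_pmf(k, M, n, r):
    """Evaluate f(k; M, n, r) from the formula, assuming 1 <= r <= M - n."""
    if not (0 <= n <= M and 1 <= r <= M - n):
        raise ValueError("expected 0 <= n <= M and 1 <= r <= M - n")
    if k < 0 or k > n:
        return 0.0  # outside the support k in [0, n]
    return comb(k + r - 1, k) * comb(M - r - k, n - k) / comb(M, n)


# Hypothetical parameters: 20 balls, 7 red, stop after 12 blue.
probabilities = [nhg_pmf(k, 20, 7, 12) for k in range(8)]
print(probabilities)
print(sum(probabilities))  # the probabilities over the support sum to 1 (up to rounding)
```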

This is equivalent to observing k successes in the first k+r-1 samples, with the (k+r)-th sample being a failure. The former can be modelled as a hypergeometric distribution. The probability of the latter is simply the number of failures remaining, M-n-(r-1), divided by the size of the remaining population, M-(k+r-1). This relationship can be shown as:

NHG(k;M,n,r) = HG(k;M,n,k+r-1)\frac{(M-n-(r-1))}{(M-(k+r-1))}

where NHG is the probability mass function (PMF) of the negative hypergeometric distribution and HG is the PMF of the hypergeometric distribution.
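One way to see why the product reduces to the PMF given earlier is sketched below. It assumes the standard hypergeometric PMF with k+r-1 draws, uses its symmetric form, and then applies the identity \binom{A}{b}\frac{A-b}{A} = \binom{A-1}{b} with A = M-(k+r-1) and b = n-k.

```latex
\begin{align*}
HG(k; M, n, k+r-1)
  &= \frac{\binom{n}{k}\binom{M-n}{r-1}}{\binom{M}{k+r-1}}
   = \frac{\binom{k+r-1}{k}\binom{M-(k+r-1)}{n-k}}{\binom{M}{n}}, \\
\binom{M-(k+r-1)}{n-k}\,\frac{M-n-(r-1)}{M-(k+r-1)}
  &= \binom{M-r-k}{n-k}, \\
HG(k; M, n, k+r-1)\,\frac{M-n-(r-1)}{M-(k+r-1)}
  &= \frac{\binom{k+r-1}{k}\binom{M-r-k}{n-k}}{\binom{M}{n}}
   = NHG(k; M, n, r).
\end{align*}
```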

A negative hypergeometric discrete random variable.

See Also

binom
hypergeom
nbinom

Examples

import numpy as np
from scipy.stats import nhypergeom
import matplotlib.pyplot as plt
Suppose we have a collection of 20 animals, of which 7 are dogs. Then if we want to know the probability of finding a given number of dogs (successes) in a sample with exactly 12 animals that aren't dogs (failures), we can initialize a frozen distribution and plot the probability mass function:
M, n, r = [20, 7, 12]
rv = nhypergeom(M, n, r)
x = np.arange(0, n+2)
pmf_dogs = rv.pmf(x)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, pmf_dogs, 'bo')
ax.vlines(x, 0, pmf_dogs, lw=2)
ax.set_xlabel('# of dogs in our group with given 12 failures')
ax.set_ylabel('nhypergeom PMF')
plt.show()
Instead of using a frozen distribution, we can also use `nhypergeom` methods directly. For example, to obtain the probability mass function, use:
prb = nhypergeom.pmf(x, M, n, r)
And to generate random numbers:
R = nhypergeom.rvs(M, n, r, size=10)
To verify the relationship between `hypergeom` and `nhypergeom`, use:
from scipy.stats import hypergeom, nhypergeom
M, n, r = 45, 13, 8
k = 6
nhypergeom.pmf(k, M, n, r)
hypergeom.pmf(k, M, n, k+r-1) * (M - n - (r-1)) / (M - (k+r-1))
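The same relationship can also be checked for every k in the support, not just k = 6. The sketch below avoids SciPy and evaluates both PMFs from their binomial-coefficient formulas; `hg_pmf` and `nhg_pmf` are hypothetical helper names, with `hg_pmf` following the (k, M, n, N) argument order of `hypergeom.pmf`.

```python
from math import comb, isclose


def hg_pmf(k, M, n, N):
    """Hypergeometric PMF: k successes in N draws from M items, n of them successes."""
    return comb(n, k) * comb(M - n, N - k) / comb(M, N)


def nhg_pmf(k, M, n, r):
    """Negative hypergeometric PMF from the formula in the Notes (assumes r >= 1)."""
    return comb(k + r - 1, k) * comb(M - r - k, n - k) / comb(M, n)


M, n, r = 45, 13, 8
all_match = all(
    isclose(
        nhg_pmf(k, M, n, r),
        hg_pmf(k, M, n, k + r - 1) * (M - n - (r - 1)) / (M - (k + r - 1)),
    )
    for k in range(n + 1)
)
print(all_match)
```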
GitHub : /scipy/stats/_discrete_distns.py#605