Probability distributions and corrections (utils.stats)

class orangecontrib.bio.utils.stats.Binomial(max=1000)

The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which succeeds with probability p.

__call__(k, N, m, n)

If m out of N experiments are positive, return the probability that exactly k out of n experiments are positive under the binomial distribution: with p = m/N, return bin(n, k) * p**k * (1-p)**(n-k), where bin is the binomial coefficient.
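
The binomial probability described above can be sketched with the standard library alone. This is an illustration of the formula, not the library's own implementation; the function name binomial_pmf is hypothetical, and math.comb requires Python 3.8 or later.

```python
from math import comb


def binomial_pmf(k, N, m, n):
    """Probability of exactly k positives in n trials, with p = m / N.

    Hypothetical helper mirroring the (k, N, m, n) argument order of
    Binomial.__call__; it is not part of orangecontrib.bio.
    """
    p = m / N
    # Binomial coefficient times p^k times (1 - p)^(n - k).
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# 2 positives in 4 trials with p = 5/10 = 0.5
print(binomial_pmf(2, 10, 5, 4))  # 0.375
```

The two powers are multiplied, not added, so that the probabilities over k = 0..n sum to 1.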

p_value(k, N, m, n)

Return the probability that k or more of the n experiments are positive, i.e. the upper tail of the binomial distribution with p = m/N.
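
A minimal sketch of this upper-tail probability, assuming the tail includes k itself ("k or more"). The name binomial_p_value is hypothetical and the code only illustrates the definition; the library may compute the value differently.

```python
from math import comb


def binomial_p_value(k, N, m, n):
    """P(X >= k) for X ~ Binomial(n, p) with p = m / N (illustrative helper)."""
    p = m / N
    # Sum the point probabilities for k, k + 1, ..., n.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# At least 3 positives in 4 trials with p = 0.5: 4/16 + 1/16
print(binomial_p_value(3, 10, 5, 4))  # 0.3125
```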

class orangecontrib.bio.utils.stats.Hypergeometric(max=1000)

The hypergeometric distribution is the discrete probability distribution of the number of successes in n draws, without replacement, from a finite population of size N that contains m successes.

__call__(k, N, m, n)

If m out of N experiments are positive, return the probability that exactly k out of n experiments are positive under the hypergeometric distribution, i.e. bin(m, k) * bin(N-m, n-k) / bin(N, n), where bin is the binomial coefficient.
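
The hypergeometric formula above translates directly into standard-library Python. The function name hypergeometric_pmf is hypothetical, and this sketch is not the library's implementation; it assumes 0 <= k <= n and n - k <= N - m is not required, since math.comb returns 0 when the lower argument exceeds the upper one.

```python
from math import comb


def hypergeometric_pmf(k, N, m, n):
    """Probability of exactly k positives among n draws without replacement
    from N items of which m are positive (illustrative helper)."""
    # Ways to pick k positives and n - k negatives, over all ways to pick n.
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


# 1 positive among 4 draws from 10 items, 3 of them positive: 3 * 35 / 210
print(hypergeometric_pmf(1, 10, 3, 4))  # 0.5
```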

p_value(k, N, m, n)

Return the probability that k or more of the n experiments are positive, i.e. the upper tail of the hypergeometric distribution.
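
A minimal sketch of the hypergeometric upper tail, assuming the tail includes k itself. The name hypergeometric_p_value is hypothetical; the code illustrates the definition and may differ from how the library evaluates it.

```python
from math import comb


def hypergeometric_p_value(k, N, m, n):
    """P(X >= k) when drawing n items without replacement from N items,
    m of which are positive (illustrative helper)."""
    upper = min(m, n)  # cannot draw more positives than exist or than drawn
    favourable = sum(comb(m, i) * comb(N - m, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# At least 2 positives among 4 draws from 10 items, 3 positive: (63 + 7) / 210
print(hypergeometric_p_value(2, 10, 3, 4))
```

In the example the result is 70/210, i.e. one third.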

orangecontrib.bio.utils.stats.FDR(p_values, dependent=False, m=None, ordered=False)

False Discovery Rate correction on a list of p-values.

Parameters:
  • p_values – a list of p-values.
  • dependent – use correction for dependent hypotheses (default False).
  • m – number of hypotheses tested (default len(p_values)).
  • ordered – prevent sorting of p-values if they are already sorted (default False).
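
The documentation does not name the procedure behind FDR. The sketch below assumes the Benjamini-Hochberg step-up adjustment, with the Benjamini-Yekutieli harmonic factor when dependent is true; both choices, the function name fdr_sketch, and the decision to return adjusted values in input order are assumptions for illustration. The ordered flag is left out because the sketch always sorts.

```python
def fdr_sketch(p_values, dependent=False, m=None):
    """Adjusted p-values under an assumed Benjamini-Hochberg procedure.

    Illustrative only; not the orangecontrib.bio implementation.
    """
    if m is None:
        m = len(p_values)
    # Assumed Benjamini-Yekutieli factor for dependent tests: sum of 1/i.
    c = sum(1.0 / i for i in range(1, m + 1)) if dependent else 1.0
    # Indices of p_values in ascending order of p-value.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    adjusted = [0.0] * len(p_values)
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(len(order), 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m * c / rank)
        adjusted[idx] = running_min
    return adjusted


print(fdr_sketch([0.01, 0.04, 0.03, 0.005]))
```

For this input the adjusted values are approximately 0.02, 0.04, 0.04 and 0.02, in the original order.
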

orangecontrib.bio.utils.stats.Bonferroni(p_values, m=None)

Bonferroni correction on a list of p-values.

Parameters:
  • p_values – a list of p-values.
  • m – number of hypotheses tested (default len(p_values)).
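
A minimal sketch of the Bonferroni adjustment: each p-value is multiplied by the number of hypotheses m. Capping the result at 1.0 is an assumption, since the documentation does not say whether the library does so, and the name bonferroni_sketch is hypothetical.

```python
def bonferroni_sketch(p_values, m=None):
    """Multiply each p-value by m (illustrative; not the library's code)."""
    if m is None:
        m = len(p_values)
    # Cap at 1.0 so the result is still a valid probability (assumption).
    return [min(p * m, 1.0) for p in p_values]


print(bonferroni_sketch([0.01, 0.5]))         # two hypotheses
print(bonferroni_sketch([0.01, 0.5], m=100))  # p-values taken from 100 tests
```

Passing m explicitly matters when the list holds only a subset of the p-values from a larger family of tests.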