util.grandom

Classes and functions for computing random sets of numbers.

class glimpse.util.grandom.HistogramSampler(data, bins=100, resolution=0.0025)

Random number generator based on a modelled distribution.

Given repeated observations of a single random variable, this object first models the probability distribution that governs the variable using a histogram. It then generates new variates according to this distribution.

This sampler trades space for time by approximating the cumulative histogram as a single linear array in memory, where the value of a histogram bin is represented repeatedly according to its magnitude. New variates are generated by sampling uniformly from the indices of this array, and returning the edge value of the corresponding bin. The accuracy of the sampler is governed by both the number of bins in the histogram, and the number of elements in the cumulative distribution (cum-dist) array.
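The lookup-array idea described above can be sketched as follows. This is a minimal illustration, not the glimpse implementation: it uses only the Python standard library rather than ndarrays, the helper names build_lookup_table and sample are hypothetical, and it assumes that the left edge of a bin is the value returned and that a bin's slot count is truncated toward zero.

```python
import random


def build_lookup_table(data, bins=100, resolution=0.0025):
    """Flatten a histogram of `data` into a list of bin edges.

    Hypothetical helper: each bin's left edge is repeated once per
    `resolution` of probability mass that the bin holds.
    """
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [0] * bins
    for x in data:
        # Clamp the index so that x == hi lands in the last bin.
        i = min(int((x - lo) / width), bins - 1)
        counts[i] += 1
    n_slots = round(1 / resolution)  # nominal length of the lookup array
    total = len(data)
    table = []
    for i, count in enumerate(counts):
        # Integer arithmetic avoids floating-point surprises; bins holding
        # less than `resolution` of the mass get zero slots (assumption).
        slots = count * n_slots // total
        table.extend([lo + i * width] * slots)
    return table


def sample(table, size=1, rng=random):
    """Draw `size` variates by indexing the table uniformly at random."""
    return [table[rng.randrange(len(table))] for _ in range(size)]


observations = [0.0] * 50 + [1.0] * 50
table = build_lookup_table(observations, bins=2, resolution=0.25)
variates = sample(table, size=10)
```

Drawing a variate costs one random index and one list lookup, which is the time saving that the larger array pays for. Note that in this sketch the table is empty, and sampling fails, if no bin holds at least `resolution` of the total mass.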

Sample(size=1)

Generate variates according to the modelled distribution.

Parameters: size (int, or tuple of int) – Number of variates to generate.
Returns: Generated variates.
Return type: ndarray

__init__(data, bins=100, resolution=0.0025)

Construct a new sampler object.

Parameters:
  • data (1D ndarray) – Observations for a single random variable.
  • bins (positive int) – Number of bins to use when generating the histogram.
  • resolution (float in [0, 1]) – Resolution of each element of the cum-dist array. For example, a resolution of 0.25 means that a histogram bin is represented in the cum-dist array only when its magnitude is at least 25% of the total mass (i.e., that at least 1/4 of the total observations fell in the bin’s interval). In this case, the cum-dist array requires only four elements. On the other hand, a resolution of 0.0025 means that a histogram bin needs to contain only 0.25% of the total mass to be represented in the cum-dist array, which now requires 400 elements.
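The arithmetic behind the resolution parameter can be checked with a short sketch. The function names table_length and slots_per_bin are hypothetical, and the sketch assumes that a bin's share of the array is truncated toward zero, as the "at least 25%" wording suggests; the actual rounding rule used by the library may differ.

```python
def table_length(resolution):
    """Nominal number of elements in the cum-dist array (hypothetical helper)."""
    return round(1 / resolution)


def slots_per_bin(counts, resolution):
    """Number of array elements each histogram bin receives.

    Assumes truncation: a bin gets one element per full `resolution`
    of the total mass that it holds.
    """
    total = sum(counts)
    n = table_length(resolution)
    return [count * n // total for count in counts]


counts = [30, 20, 50]  # observations per bin, 100 in total

coarse = slots_per_bin(counts, 0.25)   # 4 nominal elements
fine = slots_per_bin(counts, 0.0025)   # 400 nominal elements
```

With these counts, the coarse setting gives the three bins 1, 0 and 2 elements: the 20% bin falls below the 25% threshold and is never sampled, and the array ends up shorter than its nominal four elements. The fine setting gives 120, 80 and 200 elements, which reproduces the histogram proportions exactly.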

range_error = 0

Relative drop in range of values between observed and generated variates.