Sequential Monte Carlo¶
SMCUpdater
- SMC-Based Particle Updater¶
Class Reference¶
-
class
qinfer.
SMCUpdater
(model, n_particles, prior, resample_a=None, resampler=None, resample_thresh=0.5, debug_resampling=False, track_resampling_divergence=False, zero_weight_policy=u'error', zero_weight_thresh=None, canonicalize=True)[source]¶ Bases:
qinfer.distributions.Distribution
Creates a new Sequential Monte carlo updater, using the algorithm of [GFWC12].
Parameters: - model (Model) – Model whose parameters are to be inferred.
- n_particles (int) – The number of particles to be used in the particle approximation.
- prior (Distribution) – A representation of the prior distribution.
- resampler (callable) – Specifies the resampling algorithm to be used. See Resampling Algorithms for more details.
- resample_thresh (float) – Specifies the threshold for \(N_{\text{ess}}\) to decide when to resample.
- debug_resampling (bool) – If
True
, debug information will be generated on resampling performance, and will be written to the standard Python logger. - track_resampling_divergence (bool) – If true, then the divergences
between the pre- and post-resampling distributions are tracked and
recorded in the
resampling_divergences
attribute. - zero_weight_policy (str) – Specifies the action to be taken when the
particle weights would all be set to zero by an update.
One of
["ignore", "skip", "warn", "error", "reset"]
. - zero_weight_thresh (float) – Value to be used when testing for the zero-weight condition.
- canonicalize (bool) – If
True
, particle locations will be updated to canonical locations as described by the model class after each prior sampling and resampling.
-
n_particles
¶ Returns the number of particles currently used in the sequential Monte Carlo approximation.
Type: int
-
resample_count
¶ Returns the number of times that the updater has resampled the particle approximation.
Type: int
-
just_resampled
¶ True
if and only if there has been no data added since the last resampling, or if there has not yet been a resampling step.Type: bool
-
log_total_likelihood
¶ Returns the log-likelihood of all the data collected so far.
Equivalent to:
np.sum(np.log(updater.normalization_record))
Type: float
-
n_ess
¶ Estimates the effective sample size (ESS) of the current distribution over model parameters.
Type: float
Returns: The effective sample size, given by \(1/\sum_i w_i^2\).
-
min_n_ess
¶ Returns the smallest effective sample size (ESS) observed in the history of this updater.
Type: float
Returns: The minimum of observed effective sample sizes as reported by n_ess
.
-
resampling_divergences
¶ List of KL divergences between the pre- and post-resampling distributions, if that is being tracked. Otherwise,
None
.Type: list
offloat
orNone
-
reset
(n_particles=None, only_params=None, reset_weights=True)[source]¶ Causes all particle locations and weights to be drawn fresh from the initial prior.
Parameters:
-
hypothetical_update
(outcomes, expparams, return_likelihood=False, return_normalization=False)[source]¶ Produces the particle weights for the posterior of a hypothetical experiment.
Parameters: - outcomes (int or an ndarray of dtype int.) – Integer index of the outcome of the hypothetical experiment.
- expparams (numpy.ndarray) – Experiments to be used for the hypothetical updates.
- weights (ndarray, shape (n_outcomes, n_expparams, n_particles)) – Weights assigned to each particle in the posterior distribution \(\Pr(\omega | d)\).
-
update
(outcome, expparams, check_for_resample=True)[source]¶ Given an experiment and an outcome of that experiment, updates the posterior distribution to reflect knowledge of that experiment.
After updating, resamples the posterior distribution if necessary.
Parameters: - outcome (int) – Label for the outcome that was observed, as defined
by the
Model
instance under study. - expparams (
ndarray
of dtype given by theexpparams_dtype
property of the underlying model) – Parameters describing the experiment that was performed. - check_for_resample (bool) – If
True
, after performing the update, the effective sample size condition will be checked and a resampling step may be performed.
- outcome (int) – Label for the outcome that was observed, as defined
by the
-
batch_update
(outcomes, expparams, resample_interval=5)[source]¶ Updates based on a batch of outcomes and experiments, rather than just one.
Parameters: - outcomes (numpy.ndarray) – An array of outcomes of the experiments that were performed.
- expparams (numpy.ndarray) – Either a scalar or record single-index array of experiments that were performed.
- resample_interval (int) – Controls how often to check whether \(N_{\text{ess}}\) falls below the resample threshold.
-
n_rvs
¶ The number of random variables described by the posterior distribution.
-
sample
(n=1)[source]¶ Returns samples from the current posterior distribution.
Parameters: n (int) – The number of samples to draw. Returns: The sampled model parameter vectors. Return type: ndarray
of shape(n, updater.n_rvs)
.
-
est_mean
()[source]¶ Returns an estimate of the posterior mean model, given by the expectation value over the current SMC approximation of the posterior model distribution.
Return type: numpy.ndarray
, shape(n_modelparams,)
.Returns: An array containing the an estimate of the mean model vector.
-
est_meanfn
(fn)[source]¶ Returns an estimate of the expectation value of a given function \(f\) of the model parameters, given by a sum over the current SMC approximation of the posterior distribution over models.
Here, \(f\) is represented by a function
fn
that is vectorized over particles, such thatf(modelparams)
has shape(n_particles, k)
, wheren_particles = modelparams.shape[0]
, and wherek
is a positive integer.Parameters: fn (callable) – Function implementing \(f\) in a vectorized manner. (See above.) Return type: numpy.ndarray
, shape(k, )
.Returns: An array containing the an estimate of the mean of \(f\).
-
est_covariance_mtx
(corr=False)[source]¶ Returns an estimate of the covariance of the current posterior model distribution, given by the covariance of the current SMC approximation.
Parameters: corr (bool) – If True
, the covariance matrix is normalized by the outer product of the square root diagonal of the covariance matrix such that the correlation matrix is returned instead.Return type: numpy.ndarray
, shape(n_modelparams, n_modelparams)
.Returns: An array containing the estimated covariance matrix.
-
bayes_risk
(expparams)[source]¶ Calculates the Bayes risk for a hypothetical experiment, assuming the quadratic loss function defined by the current model’s scale matrix (see
qinfer.abstract_model.Simulatable.Q
).Parameters: expparams ( ndarray
of dtype given by the current model’sexpparams_dtype
property, and of shape(1,)
) – The experiment at which to compute the Bayes risk.Return float: The Bayes risk for the current posterior distribution of the hypothetical experiment expparams
.
-
expected_information_gain
(expparams)[source]¶ Calculates the expected information gain for a hypothetical experiment.
Parameters: expparams ( ndarray
of dtype given by the current model’sexpparams_dtype
property, and of shape(1,)
) – The experiment at which to compute expected information gain.Return float: The Bayes risk for the current posterior distribution of the hypothetical experiment expparams
.
-
est_entropy
()[source]¶ Estimates the entropy of the current posterior as \(-\sum_i w_i \log w_i\) where \(\{w_i\}\) is the set of particles with nonzero weight.
-
est_kl_divergence
(other, kernel=None, delta=0.01)[source]¶ Finds the KL divergence between this and another SMC-approximated distribution by using a kernel density estimator to smooth over the other distribution’s particles.
Parameters: other (SMCUpdater) –
-
est_cluster_metric
(cluster_opts=None)[source]¶ Returns an estimate of how much of the variance in the current posterior can be explained by a separation between clusters.
-
est_credible_region
(level=0.95, return_outside=False, modelparam_slice=None)[source]¶ Returns an array containing particles inside a credible region of a given level, such that the described region has probability mass no less than the desired level.
Particles in the returned region are selected by including the highest- weight particles first until the desired credibility level is reached.
Parameters: Return type: numpy.ndarray
, shape(n_credible, n_mps)
, wheren_credible
is the number of particles in the credible region andn_mps
corresponds to the size ofmodelparam_slice
.If
return_outside
isTrue
, this method instead returns tuple(inside, outside)
whereinside
is as described above, andoutside
has shape(n_particles-n_credible, n_mps)
.Returns: An array of particles inside the estimated credible region. Or, if
return_outside
isTrue
, both the particles inside and the particles outside, as a tuple.
-
region_est_hull
(level=0.95, modelparam_slice=None)[source]¶ Estimates a credible region over models by taking the convex hull of a credible subset of particles.
Parameters: - level (float) – The desired crediblity level (see
SMCUpdater.est_credible_region()
). - modelparam_slice (slice) – Slice over which model parameters to consider.
Returns: The tuple
(faces, vertices)
wherefaces
describes all the vertices of all of the faces on the exterior of the convex hull, andvertices
is a list of all vertices on the exterior of the convex hull.Return type: faces
is anumpy.ndarray
with shape(n_face, n_mps, n_mps)
and indeces(idx_face, idx_vertex, idx_mps)
wheren_mps
corresponds to the size ofmodelparam_slice
.vertices
is annumpy.ndarray
of shape(n_vertices, n_mps)
.- level (float) – The desired crediblity level (see
-
region_est_ellipsoid
(level=0.95, tol=0.0001, modelparam_slice=None)[source]¶ Estimates a credible region over models by finding the minimum volume enclosing ellipse (MVEE) of a credible subset of particles.
Parameters: - level (float) – The desired crediblity level (see
SMCUpdater.est_credible_region()
). - tol (float) – The allowed error tolerance in the MVEE optimization
(see
mvee()
). - modelparam_slice (slice) – Slice over which model parameters to consider.
Returns: A tuple
(A, c)
whereA
is the covariance matrix of the ellipsoid andc
is the center. A point \(\vec{x}\) is in the ellipsoid whenever \((\vec{x}-\vec{c})^{T}A^{-1}(\vec{x}-\vec{c})\leq 1\).Return type: A
isnp.ndarray
of shape(n_mps,n_mps)
andcentroid
isnp.ndarray
of shape(n_mps)
.n_mps
corresponds to the size ofparam_slice
.- level (float) – The desired crediblity level (see
-
in_credible_region
(points, level=0.95, modelparam_slice=None, method=u'hpd-hull', tol=0.0001)[source]¶ Decides whether each of the points lie within a credible region of the current distribution.
If
tol
isNone
, the particles are tested directly against the convex hull object. Iftol
is a positivefloat
, particles are tested to be in the interior of the smallest enclosing ellipsoid of this convex hull, seeSMCUpdater.region_est_ellipsoid()
.Parameters: - points (np.ndarray) – An
np.ndarray
of shape(n_mps)
for a single point, or of shape(n_points, n_mps)
for multiple points, wheren_mps
corresponds to the same dimensionality asparam_slice
. - level (float) – The desired crediblity level (see
SMCUpdater.est_credible_region()
). - method (str) – A string specifying which credible region estimator to
use. One of
'pce'
,'hpd-hull'
or'hpd-mvee'
(see below). - tol (float) – The allowed error tolerance for those methods
which require a tolerance (see
mvee()
). - modelparam_slice (slice) – A slice describing which model parameters to consider in the credible region, effectively marginizing out the remaining parameters. By default, all model parameters are included.
Returns: A boolean array of shape
(n_points, )
specifying whether each of the points lies inside the confidence region.The following values are valid for the
method
argument.'pce'
: Posterior Covariance Ellipsoid.Computes the covariance matrix of the particle distribution marginalized over the excluded slices and uses the \(\chi^2\) distribution to determine how to rescale it such the the corresponding ellipsoid has the correct size. The ellipsoid is translated by the mean of the particle distribution. It is determined which of the
points
are on the interior.
'hpd-hull'
: High Posterior Density Convex Hull.See
SMCUpdater.region_est_hull()
. Computes the HPD region resulting from the particle approximation, computes the convex hull of this, and it is determined which of thepoints
are on the interior.
'hpd-mvee'
: High Posterior Density Minimum Volume Enclosing Ellipsoid.See
SMCUpdater.region_est_ellipsoid()
andmvee()
. Computes the HPD region resulting from the particle approximation, computes the convex hull of this, and determines the minimum enclosing ellipsoid. Deterimines which of thepoints
are on the interior.
- points (np.ndarray) – An
-
posterior_marginal
(idx_param=0, res=100, smoothing=0, range_min=None, range_max=None)[source]¶ Returns an estimate of the marginal distribution of a given model parameter, based on taking the derivative of the interpolated cdf.
Parameters: - idx_param (int) – Index of parameter to be marginalized.
- res1 (int) – Resolution of of the axis.
- smoothing (float) – Standard deviation of the Gaussian kernel used to smooth; same units as parameter.
- range_min (float) – Minimum range of the output axis.
- range_max (float) – Maximum range of the output axis.
See also
-
plot_posterior_marginal
(idx_param=0, res=100, smoothing=0, range_min=None, range_max=None, label_xaxis=True, other_plot_args={}, true_model=None)[source]¶ Plots a marginal of the requested parameter.
Parameters: - idx_param (int) – Index of parameter to be marginalized.
- res1 (int) – Resolution of of the axis.
- smoothing (float) – Standard deviation of the Gaussian kernel used to smooth; same units as parameter.
- range_min (float) – Minimum range of the output axis.
- range_max (float) – Maximum range of the output axis.
- label_xaxis (bool) – Labels the \(x\)-axis with the model parameter name given by this updater’s model.
- other_plot_args (dict) – Keyword arguments to be passed to
matplotlib’s
plot
function. - true_model (np.ndarray) – Plots a given model parameter vector as the “true” model for comparison.
See also
-
plot_covariance
(corr=False, param_slice=None, tick_labels=None, tick_params=None)[source]¶ Plots the covariance matrix of the posterior as a Hinton diagram.
Note
This function requires that mpltools is installed.
Parameters: - corr (bool) – If
True
, the covariance matrix is first normalized by the outer product of the square root diagonal of the covariance matrix such that the correlation matrix is plotted instead. - param_slice (slice) – Slice of the modelparameters to be plotted.
- tick_labels (list) – List of tick labels for each component; by default, these are drawn from the model itself.
- corr (bool) – If
-
posterior_mesh
(idx_param1=0, idx_param2=1, res1=100, res2=100, smoothing=0.01)[source]¶ Returns a mesh, useful for plotting, of kernel density estimation of a 2D projection of the current posterior distribution.
Parameters: - idx_param1 (int) – Parameter to be treated as \(x\) when plotting.
- idx_param2 (int) – Parameter to be treated as \(y\) when plotting.
- res1 (int) – Resolution along the \(x\) direction.
- res2 (int) – Resolution along the \(y\) direction.
- smoothing (float) – Standard deviation of the Gaussian kernel used to smooth the particle approximation to the current posterior.
See also
-
plot_posterior_contour
(idx_param1=0, idx_param2=1, res1=100, res2=100, smoothing=0.01)[source]¶ Plots a contour of the kernel density estimation of a 2D projection of the current posterior distribution.
Parameters: - idx_param1 (int) – Parameter to be treated as \(x\) when plotting.
- idx_param2 (int) – Parameter to be treated as \(y\) when plotting.
- res1 (int) – Resolution along the \(x\) direction.
- res2 (int) – Resolution along the \(y\) direction.
- smoothing (float) – Standard deviation of the Gaussian kernel used to smooth the particle approximation to the current posterior.
See also