Abstract Model Classes¶
Introduction¶
A model in QInfer is a class that describes the probabilities of observing
data, given a particular experiment and a particular set of model
parameters. The observation probabilities may be given implicitly or explicitly:
a class may only allow for sampling observations, or it may also compute the
distribution itself. In the implicit case, a model is represented by a
subclass of Simulatable, while in the explicit case, the model is represented
by a subclass of Model.
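The distinction can be illustrated without QInfer at all. The following is a minimal NumPy sketch for a biased coin; the function names `sample_outcomes` and `likelihood` are hypothetical and are not part of the QInfer API.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_outcomes(p_heads, n_samples):
    """Implicit description: outcomes can only be drawn (1 = heads, 0 = tails)."""
    return (rng.random(n_samples) < p_heads).astype(int)

def likelihood(outcome, p_heads):
    """Explicit description: the probability of a given outcome is computed."""
    return p_heads if outcome == 1 else 1.0 - p_heads

samples = sample_outcomes(0.3, 1000)   # what a Simulatable-style model offers
pr_heads = likelihood(1, 0.3)          # what a Model-style model adds
```

An implicit model corresponds to the first function alone, while an explicit model provides both.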
Simulatable - Base Class for Implicit (Simulatable) Models¶
Class Reference¶
-
class
qinfer.
Simulatable
[source]¶ Bases:
object
Represents a system which can be simulated according to various model parameters and experimental control parameters in order to produce representative data.
See Designing and Using Models for more details.
Parameters: allow_identical_outcomes (bool) – Whether the method outcomes should be allowed to return multiple identical outcomes for a given expparam. It will be more efficient to set this to True whenever it is likely that multiple identical outcomes will occur.
-
n_modelparams
¶ Returns the number of real model parameters admitted by this model.
This property is assumed by inference engines to be constant for the lifetime of a
Model
instance.
-
expparams_dtype
¶ Returns the dtype of an experiment parameter array. For a model with single-parameter control, this will likely be a scalar dtype, such as "float64". More generally, this can be a record type, such as [('time', 'float64'), ('axis', 'uint8')].
This property is assumed by inference engines to be constant for the lifetime of a Model instance.
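As an illustration of the record-type case, the sketch below builds an experiment parameter array with NumPy alone. The field names `time` and `axis` come from the example dtype above; the values are made up.

```python
import numpy as np

# A record dtype with one float field and one small unsigned integer field.
expparams_dtype = [('time', 'float64'), ('axis', 'uint8')]

# One record per experiment; each field is addressed by name.
expparams = np.zeros((3,), dtype=expparams_dtype)
expparams['time'] = [0.1, 0.2, 0.4]
expparams['axis'] = [0, 1, 0]
```

Each element of `expparams` then describes one experiment, and `expparams['time']` is an ordinary float array of shape `(3,)`.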
-
is_n_outcomes_constant
¶ Returns True if and only if both the domain and n_outcomes are independent of the expparam.
This property is assumed by inference engines to be constant for the lifetime of a Model instance.
-
model_chain
¶ Returns a tuple of the models upon which this model is based, so that the properties and methods of the underlying models can be accessed when one model decorates another. For a standalone model, this is always the empty tuple.
-
base_model
¶ Returns the most basic model that this model depends on. For standalone models, this property satisfies
model.base_model is model
.
-
underlying_model
¶ Returns the model that this model is based on (decorates) if such a model exists, or
None
if this model is independent.
-
sim_count
¶ Returns the number of data samples that have been produced by this simulator.
Return type: int
-
Q
¶ Returns the diagonal of the scale matrix \(\matr{Q}\) that relates the scales of each of the model parameters. In particular, the quadratic loss for this Model is defined as:
\[L_{\matr{Q}}(\vec{x}, \hat{\vec{x}}) = (\vec{x} - \hat{\vec{x}})^\T \matr{Q} (\vec{x} - \hat{\vec{x}})\]
If a subclass does not explicitly define the scale matrix, it is taken to be the identity matrix of appropriate dimension.
Returns: The diagonal elements of \(\matr{Q}\).
Return type: ndarray of shape (n_modelparams, ).
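Because only the diagonal of the scale matrix is stored, the quadratic loss reduces to a weighted sum of squared errors. A minimal sketch, assuming a diagonal scale matrix; the helper name `quadratic_loss` is hypothetical:

```python
import numpy as np

def quadratic_loss(x, x_hat, q_diag):
    """Evaluates (x - x_hat)^T Q (x - x_hat) for a diagonal Q given by q_diag."""
    delta = np.asarray(x, dtype=float) - np.asarray(x_hat, dtype=float)
    return float(np.sum(q_diag * delta ** 2))

q_diag = np.array([1.0, 4.0])          # second parameter weighted four times more
loss = quadratic_loss([1.0, 2.0], [0.5, 1.5], q_diag)
```

Here each parameter is off by 0.5, so the loss is 1 × 0.25 + 4 × 0.25 = 1.25.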
-
modelparam_names
¶ Returns the names of the various model parameters admitted by this model, formatted as LaTeX strings.
-
are_expparam_dtypes_consistent
(expparams)[source]¶ Returns True iff all of the given expparams correspond to outcome domains with the same dtype. For efficiency, concrete subclasses should override this method if the result is always True.
Parameters: expparams (np.ndarray) – Array of expparams of type expparams_dtype
Return type: bool
-
n_outcomes
(expparams)[source]¶ Returns an array of dtype uint describing the number of outcomes for each experiment specified by expparams. If the number of outcomes does not depend on expparams (i.e. is_n_outcomes_constant is True), this method should return a single number. If there are an infinite (or intractably large) number of outcomes, this value specifies the number of outcomes to randomly sample.
Parameters: expparams (numpy.ndarray) – Array of experimental parameters. This array must be of dtype agreeing with the expparams_dtype property.
-
domain
(exparams)[source]¶ Returns a list of Domain objects, one for each input expparam.
Parameters: expparams (numpy.ndarray) – Array of experimental parameters. This array must be of dtype agreeing with the expparams_dtype property, or, in the case where is_n_outcomes_constant is True, None should be a valid input.
Return type: list of Domain
-
are_models_valid
(modelparams)[source]¶ Given a shape (n_models, n_modelparams) array of model parameters, returns a boolean array of shape (n_models) specifying whether each set of model parameters is valid under this model.
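As a sketch of the expected input and output shapes, suppose (purely for illustration) that every model parameter must lie in the interval [0, 1]. The function below is a hypothetical stand-in for an overridden method, written with NumPy only.

```python
import numpy as np

def are_models_valid_sketch(modelparams):
    """Returns one boolean per row: True when every parameter lies in [0, 1]."""
    return np.all((modelparams >= 0) & (modelparams <= 1), axis=1)

modelparams = np.array([
    [0.2, 0.9],   # valid
    [1.5, 0.1],   # first parameter out of range
    [0.0, 1.0],   # valid (boundaries included)
])
valid = are_models_valid_sketch(modelparams)
```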
-
simulate_experiment
(modelparams, expparams, repeat=1)[source]¶ Produces data according to the given model parameters and experimental parameters, structured as a NumPy array.
Parameters: - modelparams (np.ndarray) – A shape (n_models, n_modelparams) array of model parameter vectors describing the hypotheses under which data should be simulated.
- expparams (np.ndarray) – A shape (n_experiments, ) array of experimental control settings, with dtype given by expparams_dtype, describing the experiments whose outcomes should be simulated.
- repeat (int) – How many times the specified experiment should be repeated.
Return type: np.ndarray
Returns: A three-index tensor data[i, j, k], where i is the repetition, j indexes which vector of model parameters was used, and where k indexes which experimental parameters were used. If repeat == 1, len(modelparams) == 1 and len(expparams) == 1, then a scalar datum is returned instead.
-
clear_cache
()[source]¶ Tells the model to clear any internal caches used in computing likelihoods and drawing samples. Calling this method should not cause any different results, but should only affect performance.
-
experiment_cost
(expparams)[source]¶ Given an array of experimental parameters, returns the cost associated with performing each experiment. By default, this cost is constant (one) for every experiment.
Parameters: expparams (ndarray of dtype given by expparams_dtype) – An array of experimental parameters for which the cost is to be evaluated.
Returns: An array of costs corresponding to the specified experiments.
Return type: ndarray of dtype float and of the same shape as expparams.
-
distance
(a, b)[source]¶ Gives the distance between two model parameter vectors \(\vec{a}\) and \(\vec{b}\). By default, this is the vector 1-norm of the rescaled difference \(\mathbf{Q} (\vec{a} - \vec{b})\), where the rescaling is given by Q.
Parameters: - a (np.ndarray) – Array of model parameter vectors having shape (n_models, n_modelparams).
- b (np.ndarray) – Array of model parameters to compare to, having the same shape as a.
Returns: An array d of distances d[i] between a[i, :] and b[i, :].
-
update_timestep
(modelparams, expparams)[source]¶ Returns a set of model parameter vectors that is the update of an input set of model parameter vectors, such that the new models are conditioned on a particular experiment having been performed. By default, this is the trivial function \(\vec{x}(t_{k+1}) = \vec{x}(t_k)\).
Parameters: - modelparams (np.ndarray) – Set of model parameter vectors to be updated.
- expparams (np.ndarray) – An experiment parameter array describing the experiment that was just performed.
Return np.ndarray: Array of shape
(n_models, n_modelparams, n_experiments)
describing the update of each model according to each experiment.
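For the trivial default, every model is simply copied once per experiment. A minimal sketch of the resulting shape, using NumPy only; the name `trivial_update` is hypothetical, and the experiment parameters are taken to be a plain float array.

```python
import numpy as np

def trivial_update(modelparams, expparams):
    """Copies each model parameter vector once per experiment (no dynamics)."""
    n_experiments = expparams.shape[0]
    # Add a trailing experiment axis and repeat along it.
    return np.repeat(modelparams[:, :, np.newaxis], n_experiments, axis=2)

modelparams = np.arange(8.0).reshape(4, 2)   # 4 models, 2 parameters each
expparams = np.array([0.1, 0.2, 0.3])        # 3 experiments
updated = trivial_update(modelparams, expparams)
```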
-
canonicalize
(modelparams)[source]¶ Returns a canonical set of model parameters corresponding to a given possibly non-canonical set. This is used for models in which there exist model parameters \(\vec{x}_i\) and \(\vec{x}_j\) such that
\[\Pr(d | \vec{x}_i; \vec{e}) = \Pr(d | \vec{x}_j; \vec{e})\]
for all outcomes \(d\) and experiments \(\vec{e}\). For models admitting such an ambiguity, this method should then be overridden to return a consistent choice out of such vectors, hence avoiding spurious model degeneracies.
Note that, by default,
SMCUpdater
will not call this method.
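As an example of such a degeneracy, consider a hypothetical single-parameter model whose probability of outcome 0 is \(\cos^2(\omega t / 2)\). Since this is even in \(\omega\), the parameters \(\omega\) and \(-\omega\) cannot be distinguished, and taking the absolute value is one consistent choice. The sketch below uses NumPy only, and its function names are assumptions rather than QInfer API.

```python
import numpy as np

def pr0(modelparams, t):
    """Probability of outcome 0 for the illustrative cos^2 model."""
    return np.cos(modelparams[:, 0] * t / 2) ** 2

def canonicalize_sketch(modelparams):
    """Maps omega and -omega to the same representative, |omega|."""
    return np.abs(modelparams)

modelparams = np.array([[-1.2], [1.2], [0.4]])
canonical = canonicalize_sketch(modelparams)
```

The first two rows predict identical data and are mapped to the same canonical vector.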
Model - Base Class for Explicit (Likelihood) Models¶
If a model supports explicit calculation of the likelihood function, then this
is represented by subclassing from Model
.
Class Reference¶
-
class
qinfer.
Model
(allow_identical_outcomes=False, outcome_warning_threshold=0.99)[source]¶ Bases:
qinfer.abstract_model.Simulatable
Represents a system which can be simulated according to various model parameters and experimental control parameters in order to produce the probability of a hypothetical data record. As opposed to Simulatable, instances of Model not only produce data consistent with the description of a system, but also evaluate the probability of that data arising from the system.
Parameters: - allow_identical_outcomes (bool) – Whether the method representative_outcomes should be allowed to return multiple identical outcomes for a given expparam.
- outcome_warning_threshold (float) – Threshold value below which representative_outcomes will issue a warning about the representative outcomes not adequately covering the domain with respect to the relevant distribution.
See Designing and Using Models for more details.
-
call_count
¶ Returns the number of points at which the probability of this model has been evaluated, where a point consists of a hypothesis about the model (a vector of model parameters), an experimental control setting (expparams) and a hypothetical or actual datum.
Return type: int
-
likelihood
(outcomes, modelparams, expparams)[source]¶ Calculates the probability of each given outcome, conditioned on each given model parameter vector and each given experimental control setting.
Parameters: - modelparams (np.ndarray) – A shape (n_models, n_modelparams) array of model parameter vectors describing the hypotheses for which the likelihood function is to be calculated.
- expparams (np.ndarray) – A shape (n_experiments, ) array of experimental control settings, with dtype given by expparams_dtype, describing the experiments from which the given outcomes were drawn.
Return type: np.ndarray
Returns: A three-index tensor L[i, j, k], where i is the outcome being considered, j indexes which vector of model parameters was used, and where k indexes which experimental parameters were used. Each element L[i, j, k] then corresponds to the likelihood \(\Pr(d_i | \vec{x}_j; e_k)\).
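The layout of the likelihood tensor can be sketched for an illustrative two-outcome model in which outcome 0 occurs with probability \(\cos^2(\omega t / 2)\). This is a NumPy-only sketch under that assumption, not a QInfer class; the function name `precession_likelihood` is hypothetical, and the experiment parameters are taken to be a plain float array of times.

```python
import numpy as np

def precession_likelihood(outcomes, modelparams, expparams):
    """Returns L[i, j, k] = Pr(outcomes[i] | modelparams[j]; expparams[k])."""
    omega = modelparams[:, 0][:, np.newaxis]      # shape (n_models, 1)
    t = expparams[np.newaxis, :]                  # shape (1, n_experiments)
    pr0 = np.cos(omega * t / 2) ** 2              # shape (n_models, n_experiments)
    per_outcome = np.stack([pr0, 1 - pr0])        # index 0: outcome 0, index 1: outcome 1
    return per_outcome[np.asarray(outcomes)]      # select one table per requested outcome

outcomes = np.array([0, 1, 1])
modelparams = np.array([[0.7], [1.3]])
expparams = np.array([0.5, 1.0, 2.0, 4.0])
L = precession_likelihood(outcomes, modelparams, expparams)
```

Here `L` has shape `(3, 2, 4)`, and the slices for outcomes 0 and 1 sum to one at every model and experiment.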
-
allow_identical_outcomes
¶ Whether the method representative_outcomes should be allowed to return multiple identical outcomes for a given expparam. It will be more efficient to set this to True whenever it is likely that multiple identical outcomes will occur.
Returns: Flag state.
Return type: bool
-
outcome_warning_threshold
¶ Threshold value below which representative_outcomes will issue a warning about the representative outcomes not adequately covering the domain with respect to the relevant distribution.
Returns: Threshold value.
Return type: float
FiniteOutcomeModel - Base Class for Models with a Finite Number of Outcomes¶
The likelihood function provided by a subclass is used to implement
Simulatable.simulate_experiment(), which is possible because the
likelihood of all possible outcomes can be computed.
This class also concretely implements the domain method
by looking at the definition of n_outcomes.
Class Reference¶
-
class
qinfer.
FiniteOutcomeModel
(allow_identical_outcomes=False, outcome_warning_threshold=0.99, n_outcomes_cutoff=None)[source]¶ Bases:
qinfer.abstract_model.Model
Represents a system in the same way that Model does, except that it is demanded that the number of outcomes for any experiment be known and finite.
Parameters: - allow_identical_outcomes (bool) – Whether the method representative_outcomes should be allowed to return multiple identical outcomes for a given expparam.
- outcome_warning_threshold (float) – Threshold value below which representative_outcomes will issue a warning about the representative outcomes not adequately covering the domain with respect to the relevant distribution.
- n_outcomes_cutoff (int) – If n_outcomes exceeds this value, representative_outcomes will use this value in its place. This is useful in the case of a finite yet intractable number of outcomes. Use None for no cutoff.
See Model and Designing and Using Models for more details.
-
n_outcomes_cutoff
¶ If n_outcomes exceeds this value for some expparam, representative_outcomes will use this value in its place. This is useful in the case of a finite yet intractable number of outcomes.
Returns: Cutoff value.
Return type: int
-
domain
(expparams)[source]¶ Returns a list of Domain objects, one for each input expparam.
Parameters: expparams (numpy.ndarray) – Array of experimental parameters. This array must be of dtype agreeing with the expparams_dtype property, or, in the case where is_n_outcomes_constant is True, None should be a valid input.
Return type: list of Domain
-
static
pr0_to_likelihood_array
(outcomes, pr0)[source]¶ Assuming a two-outcome measurement with probabilities given by the array pr0, returns an array of the form expected to be returned by the likelihood method.
Parameters: - outcomes (numpy.ndarray) – Array of integers indexing outcomes.
- pr0 (numpy.ndarray) – Array of shape (n_models, n_experiments) describing the probability of obtaining outcome 0 from each set of model parameters and experiment parameters.
DifferentiableModel - Base Class for Explicit Models with Differentiable Likelihoods¶
Class Reference¶
-
class
qinfer.
DifferentiableModel
(allow_identical_outcomes=False, outcome_warning_threshold=0.99)[source]¶ Bases:
qinfer.abstract_model.Model
-
score
(outcomes, modelparams, expparams, return_L=False)[source]¶ Returns the score of this likelihood function, defined as:
\[q(d, \vec{x}; \vec{e}) = \vec{\nabla}_{\vec{x}} \log \Pr(d | \vec{x}; \vec{e}).\]
The result is represented as a four-index tensor score[idx_modelparam, idx_outcome, idx_model, idx_experiment]. The left-most index may be suppressed for single-parameter models.
If return_L is True, both q and the likelihood L are returned as q, L.
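For a concrete feel of the score tensor, the sketch below evaluates it analytically for an illustrative two-outcome model with \(\Pr(0 | \omega; t) = \cos^2(\omega t / 2)\), whose derivative with respect to \(\omega\) is \(-(t/2)\sin(\omega t)\). It relies on NumPy only; the function name `score_sketch` and the model itself are assumptions, not QInfer code.

```python
import numpy as np

def score_sketch(outcomes, modelparams, expparams):
    """Returns (q, L) with q[idx_modelparam, idx_outcome, idx_model, idx_experiment]."""
    omega = modelparams[:, 0][:, np.newaxis]       # shape (n_models, 1)
    t = expparams[np.newaxis, :]                   # shape (1, n_experiments)
    pr0 = np.cos(omega * t / 2) ** 2
    dpr0 = -(t / 2) * np.sin(omega * t)            # derivative of pr0 w.r.t. omega
    is_zero = np.asarray(outcomes)[:, np.newaxis, np.newaxis] == 0
    L = np.where(is_zero, pr0, 1 - pr0)            # likelihood of each outcome
    dL = np.where(is_zero, dpr0, -dpr0)            # its derivative w.r.t. omega
    q = (dL / L)[np.newaxis, ...]                  # grad of log L; one model parameter
    return q, L

outcomes = np.array([0, 1])
modelparams = np.array([[0.7], [1.3]])
expparams = np.array([0.5, 1.0, 2.0])
q, L = score_sketch(outcomes, modelparams, expparams)
```

With two outcomes, two models, and three experiments, `q` has shape `(1, 2, 2, 3)` and `L` has shape `(2, 2, 3)`.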
-
fisher_information
(modelparams, expparams)[source]¶ Returns the covariance of the score taken over possible outcomes, known as the Fisher information.
The result is represented as the four-index tensor fisher[idx_modelparam_i, idx_modelparam_j, idx_model, idx_experiment], which gives the Fisher information matrix for each model vector and each experiment vector.
Note
The default implementation of this method calls
score()
for each possible outcome, which can be quite slow. If possible, overriding this method can give significant speed advantages.
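Because the score has zero mean under the likelihood, its covariance over outcomes is the likelihood-weighted sum of outer products of the score. The sketch below carries out that sum with `numpy.einsum` for an illustrative two-outcome model with \(\Pr(0 | \omega; t) = \cos^2(\omega t / 2)\). It is not QInfer's implementation, and `fisher_from_score` is a hypothetical helper.

```python
import numpy as np

def fisher_from_score(q, L):
    """Sums L[o] * q[i, o] * q[j, o] over outcomes o, per model m and experiment e."""
    return np.einsum('ome,iome,jome->ijme', L, q, q)

omega = np.array([0.7, 1.3])[:, np.newaxis]        # shape (n_models, 1)
t = np.array([0.5, 1.0, 2.0])[np.newaxis, :]       # shape (1, n_experiments)
pr0 = np.cos(omega * t / 2) ** 2
dpr0 = -(t / 2) * np.sin(omega * t)                # derivative of pr0 w.r.t. omega

L = np.stack([pr0, 1 - pr0])                       # shape (2, n_models, n_experiments)
q = np.stack([dpr0 / pr0, -dpr0 / (1 - pr0)])[np.newaxis, ...]   # shape (1, 2, ...)
fisher = fisher_from_score(q, L)                   # shape (1, 1, n_models, n_experiments)
```

For this particular model the sum simplifies to \(t^2\) for every \(\omega\), which gives a convenient check on the tensor layout.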