Built-in objects

Optimizers
class hessianfree.optimizers.Optimizer
Bases: object

Base class for optimizers.

Each optimizer has a self.net attribute that is set automatically when the optimizer is added to a network (it refers to that network).

compute_update(printing=False)
Compute a weight update for the current batch.

It can be assumed that the batch has already been stored in net.inputs and net.targets, and that the nonlinearity activations/derivatives for the batch are cached in net.activations and net.d_activations.

Parameters:
- printing (bool) – if True, print out data about the optimization
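For illustration, a minimal custom optimizer might look like the following sketch. The net.calc_grad() call and the convention that compute_update() returns a flat update vector are assumptions modeled on the first-order SGD optimizer below, not guarantees of this page; check the package source before relying on them.

    import numpy as np
    from hessianfree.optimizers import Optimizer

    class ScaledSGD(Optimizer):
        """Hypothetical optimizer: gradient descent with a fixed scale."""

        def __init__(self, l_rate=0.1):
            super(ScaledSGD, self).__init__()
            self.l_rate = l_rate

        def compute_update(self, printing=False):
            # self.net is set when this optimizer is added to a network
            grad = self.net.calc_grad()  # assumed gradient helper on the network
            if printing:
                print("gradient norm: %s" % np.linalg.norm(grad))
            return -self.l_rate * grad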
class hessianfree.optimizers.HessianFree(CG_iter=250, init_damping=1, plotting=True)
Bases: hessianfree.optimizers.Optimizer

Use Hessian-free optimization to compute the weight update.

Parameters:
- CG_iter (int) – maximum number of CG iterations to run per epoch
- init_damping (float) – the initial value of the Tikhonov damping
- plotting (bool) – if True, collect data for plotting (the actual plotting is handled in the parent network)
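A sketch of typical usage, modeled on the package demos; the FFNet constructor, the hf.opt alias, and the run_epochs signature are assumptions drawn from those demos rather than from this page:

    import numpy as np
    import hessianfree as hf

    # XOR: four 2-d inputs, one target output each
    inputs = np.asarray([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
    targets = np.asarray([[0.1], [0.9], [0.9], [0.1]], dtype=np.float32)

    net = hf.FFNet([2, 5, 1])
    net.run_epochs(inputs, targets,
                   optimizer=hf.opt.HessianFree(CG_iter=250, init_damping=1),
                   max_epochs=30)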
class hessianfree.optimizers.SGD(l_rate=1, plotting=False)
Bases: hessianfree.optimizers.Optimizer

Compute the weight update using first-order gradient descent.

Parameters:
- l_rate (float) – learning rate to apply to weight updates
- plotting (bool) – if True, collect data for plotting (the actual plotting is handled in the parent network)
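Because both optimizers implement the same Optimizer interface, SGD can be dropped in wherever HessianFree is used (reusing the assumed run_epochs call from the sketch above):

    net.run_epochs(inputs, targets, optimizer=hf.opt.SGD(l_rate=0.5),
                   max_epochs=1000)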
Nonlinearities
class hessianfree.nonlinearities.Nonlinearity(stateful=False)
Bases: object

Base class for layer nonlinearities.

Parameters:
- stateful (bool) – True if this nonlinearity has internal state (in which case it needs to return d_input, d_state, and d_output in d_activation(); see Continuous for an example)

activation(x)
Apply the nonlinearity to the inputs.

Parameters:
- x – input to the nonlinearity
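As an illustration, a stateless nonlinearity can be defined by overriding activation() together with its derivative. The d_activation() method is referenced elsewhere on this page, but its exact signature is an assumption here (some implementations may also receive the cached activations), so treat this as a sketch:

    import numpy as np
    from hessianfree.nonlinearities import Nonlinearity

    class Softplus(Nonlinearity):
        """Hypothetical smooth rectifier: f(x) = log(1 + e^x)."""

        def activation(self, x):
            return np.log1p(np.exp(x))

        def d_activation(self, x):
            # elementwise derivative: f'(x) = 1 / (1 + e^-x)
            return 1.0 / (1.0 + np.exp(-x))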
class hessianfree.nonlinearities.Tanh
Bases: hessianfree.nonlinearities.Nonlinearity

Hyperbolic tangent function

\(f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}\)
class hessianfree.nonlinearities.Logistic
Bases: hessianfree.nonlinearities.Nonlinearity

Logistic sigmoid function

\(f(x) = \frac{1}{1 + e^{-x}}\)

Note: if scipy is installed, this will use the slightly faster scipy.special.expit.
class hessianfree.nonlinearities.Linear
Bases: hessianfree.nonlinearities.Nonlinearity

Linear activation function (passes inputs through unchanged).

\(f(x) = x\)
class hessianfree.nonlinearities.ReLU(max=10000000000.0)
Bases: hessianfree.nonlinearities.Nonlinearity

Rectified linear unit

\(f(x) = \max(x, 0)\)

Parameters:
- max (float) – an upper bound on the activation, to help avoid numerical errors
class hessianfree.nonlinearities.Gaussian
Bases: hessianfree.nonlinearities.Nonlinearity

Gaussian activation function

\(f(x) = e^{-x^2}\)
class hessianfree.nonlinearities.Softmax
Bases: hessianfree.nonlinearities.Nonlinearity

Softmax activation function

\(f(x_i) = \frac{e^{x_i}}{\sum_j{e^{x_j}}}\)
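These stateless nonlinearities are typically chosen per layer when constructing a network. A sketch, assuming FFNet accepts a layers argument as in the package demos (that keyword name is an assumption, not documented on this page):

    import hessianfree as hf
    from hessianfree.nonlinearities import Linear, Tanh, Softmax

    # hypothetical classifier: linear input layer, tanh hidden layer,
    # softmax output layer
    net = hf.FFNet([64, 32, 10],
                   layers=[Linear(), Tanh(), Softmax()])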
class hessianfree.nonlinearities.SoftLIF(sigma=1, tau_rc=0.02, tau_ref=0.002, amp=0.01)
Bases: hessianfree.nonlinearities.Nonlinearity

SoftLIF activation function

Based on Hunsberger, E. and Eliasmith, C. (2015). Spiking deep networks with LIF neurons. arXiv:1510.08829.

\[f(x) = \frac{amp}{\tau_{ref} + \tau_{RC} \log\left(1 + \frac{1}{\sigma \log(1 + e^{x/\sigma})}\right)}\]

Note: this is equivalent to \(LIF(SoftReLU(x))\).

Parameters:
- sigma (float) – controls the smoothness of the nonlinearity threshold
- tau_rc (float) – LIF RC time constant
- tau_ref (float) – LIF refractory time constant
- amp (float) – scales the output of the nonlinearity
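The formula above can be evaluated directly; the following sketch simply restates it in NumPy (a worked rendering of the equation, not the library's implementation):

    import numpy as np

    def softlif(x, sigma=1.0, tau_rc=0.02, tau_ref=0.002, amp=0.01):
        # SoftReLU: j = sigma * log(1 + e^(x / sigma))
        j = sigma * np.log1p(np.exp(x / sigma))
        # LIF rate applied to the softened input
        return amp / (tau_ref + tau_rc * np.log1p(1.0 / j))

    print(softlif(np.array([0.5, 1.0, 2.0])))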
class hessianfree.nonlinearities.Continuous(base, tau=1.0, dt=1.0)
Bases: hessianfree.nonlinearities.Nonlinearity

Creates a version of the base nonlinearity that operates in continuous time (filtering inputs with the given tau/dt).

\[\frac{ds}{dt} = \frac{x - s}{\tau}\]
\[f(x) = base(s)\]

Parameters:
- base (Nonlinearity) – nonlinear output function applied to the continuous state
- tau (float) – time constant of the input filter (a higher value means the internal state changes more slowly)
- dt (float) – simulation time step
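For example, a tanh nonlinearity whose state follows its input with a 5-step time constant could be built as follows (pairing this with a recurrent network is the usual pattern in the demos, but that pairing is an assumption here):

    from hessianfree.nonlinearities import Continuous, Tanh

    # internal state s filters the input x with ds/dt = (x - s) / tau
    smooth_tanh = Continuous(Tanh(), tau=5.0, dt=1.0)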
class hessianfree.nonlinearities.Plant(stateful=True)
Bases: hessianfree.nonlinearities.Nonlinearity

Base class for a plant that can be called to dynamically generate inputs for a network.

See demos.plant() for an example of this being used in practice.

__call__(x)
Update the internal state of the plant based on the input.

Parameters:
- x – the output of the last layer in the network on the previous timestep

get_vecs()
Return a tuple of the (inputs, targets) vectors generated by the plant since the last reset.

reset(init=None)
Reset the plant to its initial state.

Parameters:
- init – override the default initial state with these values
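A skeleton implementation of the documented interface might look like the sketch below. Note that a real plant is also a Nonlinearity, so it would additionally implement the activation/derivative machinery; that part is omitted here, and the simple integrator dynamics are purely hypothetical (see demos.plant() for a complete example):

    import numpy as np
    from hessianfree.nonlinearities import Plant

    class IntegratorPlant(Plant):
        """Hypothetical plant whose state accumulates the network output."""

        def __init__(self, init_state):
            super(IntegratorPlant, self).__init__()
            self.init_state = np.asarray(init_state, dtype=np.float32)
            self.reset()

        def __call__(self, x):
            # drive the state with the network's previous output
            self.state = self.state + 0.1 * x
            self.inputs.append(self.state.copy())
            self.targets.append(np.zeros_like(self.state))  # placeholder target
            return self.state

        def get_vecs(self):
            return np.asarray(self.inputs), np.asarray(self.targets)

        def reset(self, init=None):
            self.state = self.init_state.copy() if init is None else np.asarray(init)
            self.inputs = []
            self.targets = []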
Loss functions
class hessianfree.loss_funcs.LossFunction
Bases: object

Defines a loss function that maps nonlinearity activations to error.

loss(activities, targets)
Computes the loss for each unit in the network.

Note that most loss functions are based only on the output of the final layer, activities[-1]. However, we pass the activities of all layers here so that loss functions can include things like sparsity constraints. Targets, however, are only defined for the output layer.

Targets can be defined as np.nan, which will be translated into zero error.

Parameters:
- activities (list) – output activations of each layer
- targets (ndarray) – target activation values for the last layer
d_loss(activities, targets)
First derivative of the loss function (with respect to activities).
hessianfree.loss_funcs.output_loss(func)
Convenience decorator that takes a loss defined for the output layer and converts it into the more general form in terms of all layers.
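For example, a custom loss that only looks at the output layer could be written with this decorator so that it still satisfies the all-layers interface described above. The assumption that the decorated method receives just the final-layer output alongside the targets, and the per-example reduction, are inferred from the decorator's description rather than stated on this page:

    import numpy as np
    from hessianfree.loss_funcs import LossFunction, output_loss

    class AbsoluteError(LossFunction):
        """Hypothetical loss: sum of absolute output errors."""

        @output_loss
        def loss(self, output, targets):
            # np.nan targets translate to zero error, per the note above
            return np.sum(np.abs(np.nan_to_num(output - targets)), axis=-1)

        @output_loss
        def d_loss(self, output, targets):
            # derivative of |output - target| with respect to output
            return np.sign(np.nan_to_num(output - targets))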
class hessianfree.loss_funcs.SquaredError
Bases: hessianfree.loss_funcs.LossFunction

Squared error

\(\frac{1}{2} \sum(output - target)^2\)
class hessianfree.loss_funcs.CrossEntropy
Bases: hessianfree.loss_funcs.LossFunction

Cross-entropy error

\(-\sum(target \cdot \log(output))\)
class hessianfree.loss_funcs.ClassificationError
Bases: hessianfree.loss_funcs.LossFunction

Classification error

\(argmax(output) \neq argmax(target)\)

Note: d_loss and d2_loss are not defined; classification error should only be used for validation, which doesn't require either.
class hessianfree.loss_funcs.StructuralDamping(weight, layers=None, optimizer=None)
Bases: hessianfree.loss_funcs.LossFunction

Applies structural damping, which penalizes layers for having highly variable output activity.

Note: this is not exactly the same as the structural damping in Martens (2010), because it is applied on the output side of the nonlinearity (meaning that this error will be filtered through d_activations during the backwards propagation).

Parameters:
- weight (float) – scale on structural damping relative to other losses
- layers (list) – indices specifying which layers will have the damping applied (defaults to all except the first/last layers)
- optimizer (Optimizer) – if provided, the weight on structural damping will be scaled relative to the damping attribute in the optimizer (so that any processes dynamically adjusting the damping during the optimization will also affect the structural damping)
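Tying the damping to the optimizer just means passing the same optimizer instance to both objects; a sketch (the loss_type keyword on FFNet is an assumption based on how lists of loss functions are passed to networks, per the LossSet entry below):

    import hessianfree as hf
    from hessianfree.loss_funcs import SquaredError, StructuralDamping

    opt = hf.opt.HessianFree(CG_iter=250, init_damping=1)
    net = hf.FFNet([2, 10, 10, 1],
                   loss_type=[SquaredError(),
                              StructuralDamping(1e-4, optimizer=opt)])
    # the structural damping weight now tracks opt's damping attribute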
class hessianfree.loss_funcs.SparseL1(weight, layers=None, target=0.0)
Bases: hessianfree.loss_funcs.LossFunction

Imposes an L1 sparsity constraint on nonlinearity activations.

Parameters:
- weight (float) – relative weight of the sparsity constraint
- layers (list) – indices specifying which layers will have the sparsity constraint applied (defaults to all except the first/last layers)
- target (float) – target activation level for the nonlinearities
class hessianfree.loss_funcs.SparseL2(weight, layers=None, target=0.0)
Bases: hessianfree.loss_funcs.LossFunction

Imposes an L2 sparsity constraint on nonlinearity activations.

Parameters:
- weight (float) – relative weight of the sparsity constraint
- layers (list) – indices specifying which layers will have the sparsity constraint applied (defaults to all except the first/last layers)
- target (float) – target activation level for the nonlinearities
class hessianfree.loss_funcs.LossSet(set)
Bases: hessianfree.loss_funcs.LossFunction

Combines several loss functions into one (e.g., combining SquaredError and SparseL2). It doesn't need to be created directly; a list of loss functions can be passed to FFNet/RNNet and a LossSet will be created automatically.

Parameters:
- set (list) – list of LossFunction instances

group_func(func_name, activities, targets)
Computes the given function for each LossFunction in the set and sums the results.
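So the usual pattern is to pass the list straight to the network and let it build the LossSet; for example (the loss_type keyword is the same assumption as in the StructuralDamping sketch above):

    import hessianfree as hf
    from hessianfree.loss_funcs import SquaredError, SparseL2

    # combine a primary loss with a small sparsity penalty
    net = hf.FFNet([2, 10, 1],
                   loss_type=[SquaredError(), SparseL2(1e-3, target=0.1)])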