Built-in objects
Optimizers
class hessianfree.optimizers.Optimizer

    Bases: object

    Base class for optimizers.

    Each optimizer has a self.net attribute that is set automatically when
    the optimizer is added to a network (it refers to that network).

    compute_update(printing=False)

        Compute a weight update for the current batch.

        It can be assumed that the batch has already been stored in
        net.inputs and net.targets, and that the nonlinearity
        activations/derivatives for the batch are cached in net.activations
        and net.d_activations.

        Parameters:
            - printing (bool) – if True, print out data about the optimization
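The following is a minimal sketch (not part of the library) of how a custom optimizer subclass could use this interface. The self.net.calc_grad() call and the convention that compute_update returns a flat vector which the network adds to its weights are assumptions; check the built-in SGD optimizer for the actual pattern.

    import numpy as np
    from hessianfree.optimizers import Optimizer

    class ScaledSGD(Optimizer):
        """Hypothetical optimizer: gradient descent with a fixed step size."""

        def __init__(self, l_rate=0.1):
            super(ScaledSGD, self).__init__()
            self.l_rate = l_rate

        def compute_update(self, printing=False):
            # self.net is set when this optimizer is attached to a network;
            # the batch and cached activations are already stored on the
            # network, so only the gradient is needed here.
            grad = self.net.calc_grad()  # assumed gradient helper on the network
            if printing:
                print("gradient norm: %g" % np.linalg.norm(grad))
            return -self.l_rate * grad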
class hessianfree.optimizers.HessianFree(CG_iter=250, init_damping=1, plotting=True)

    Bases: hessianfree.optimizers.Optimizer

    Use Hessian-free optimization to compute the weight update.

    Parameters:
        - CG_iter (int) – maximum number of CG iterations to run per epoch
        - init_damping (float) – the initial value of the Tikhonov damping
        - plotting (bool) – if True, collect data for plotting (the actual
          plotting is handled by the parent network)
class hessianfree.optimizers.SGD(l_rate=1, plotting=False)

    Bases: hessianfree.optimizers.Optimizer

    Compute the weight update using first-order gradient descent.

    Parameters:
        - l_rate (float) – learning rate to apply to weight updates
        - plotting (bool) – if True, collect data for plotting (the actual
          plotting is handled by the parent network)
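For context, an optimizer is not normally used on its own; it is handed to a network's training routine, which sets self.net and calls compute_update each epoch. The sketch below assumes that FFNet is importable from the top-level package and that its training entry point is run_epochs(inputs, targets, optimizer, max_epochs=...); both are assumptions, so check the FFNet/RNNet documentation for the exact signature.

    import numpy as np
    import hessianfree as hf
    from hessianfree.optimizers import HessianFree

    # toy dataset: learn the product of two inputs
    inputs = np.random.uniform(-1, 1, (1000, 2)).astype(np.float32)
    targets = np.prod(inputs, axis=1, keepdims=True).astype(np.float32)

    net = hf.FFNet([2, 10, 1])  # layer sizes (assumed constructor form)
    net.run_epochs(inputs, targets,
                   optimizer=HessianFree(CG_iter=100, init_damping=1),
                   max_epochs=10)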
Nonlinearities
class hessianfree.nonlinearities.Nonlinearity(stateful=False)

    Bases: object

    Base class for layer nonlinearities.

    Parameters:
        - stateful (bool) – True if this nonlinearity has internal state (in
          which case it needs to return d_input, d_state, and d_output in
          d_activation(); see Continuous for an example)

    activation(x)

        Apply the nonlinearity to the inputs.

        Parameters:
            - x – input to the nonlinearity
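As a sketch of this interface (not a built-in class), a stateless nonlinearity only needs to implement activation() plus a derivative. The d_activation(x, a) signature used below, taking the inputs and the cached activations, is an assumption; only the stateful case is described above, so check the built-in nonlinearities for the exact form.

    import numpy as np
    from hessianfree.nonlinearities import Nonlinearity

    class Softplus(Nonlinearity):
        """Hypothetical stateless nonlinearity: f(x) = log(1 + e^x)."""

        def activation(self, x):
            return np.log1p(np.exp(x))

        def d_activation(self, x, a):
            # the derivative of log(1 + e^x) is the logistic sigmoid
            return 1.0 / (1.0 + np.exp(-x))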
class hessianfree.nonlinearities.Tanh

    Bases: hessianfree.nonlinearities.Nonlinearity

    Hyperbolic tangent function

    \(f(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}\)
class hessianfree.nonlinearities.Logistic

    Bases: hessianfree.nonlinearities.Nonlinearity

    Logistic sigmoid function

    \(f(x) = \frac{1}{1 + e^{-x}}\)

    Note: if scipy is installed then this will use the slightly faster
    scipy.special.expit
class hessianfree.nonlinearities.Linear

    Bases: hessianfree.nonlinearities.Nonlinearity

    Linear activation function (passes inputs through unchanged).

    \(f(x) = x\)
class hessianfree.nonlinearities.ReLU(max=10000000000.0)

    Bases: hessianfree.nonlinearities.Nonlinearity

    Rectified linear unit

    \(f(x) = \max(x, 0)\)

    Parameters:
        - max – an upper bound on the activation, to help avoid numerical errors
class hessianfree.nonlinearities.Gaussian

    Bases: hessianfree.nonlinearities.Nonlinearity

    Gaussian activation function

    \(f(x) = e^{-x^2}\)
class hessianfree.nonlinearities.Softmax

    Bases: hessianfree.nonlinearities.Nonlinearity

    Softmax activation function

    \(f(x_i) = \frac{e^{x_i}}{\sum_j{e^{x_j}}}\)
class hessianfree.nonlinearities.SoftLIF(sigma=1, tau_rc=0.02, tau_ref=0.002, amp=0.01)

    Bases: hessianfree.nonlinearities.Nonlinearity

    SoftLIF activation function

    Based on Hunsberger, E. and Eliasmith, C. (2015). Spiking deep networks
    with LIF neurons. arXiv:1510.08829.

    \[f(x) = \frac{amp}{\tau_{ref} + \tau_{RC} \log\left(1 + \frac{1}{\sigma \log(1 + e^{x/\sigma})}\right)}\]

    Note: this is equivalent to \(LIF(SoftReLU(x))\)

    Parameters:
        - sigma (float) – controls the smoothness of the nonlinearity threshold
        - tau_rc (float) – LIF RC time constant
        - tau_ref (float) – LIF refractory time constant
        - amp (float) – scales the output of the nonlinearity
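The formula above can be transcribed directly into NumPy; the function below is a standalone numerical sketch for checking values, not the library's implementation (which also has to handle derivatives and numerical edge cases).

    import numpy as np

    def softlif(x, sigma=1.0, tau_rc=0.02, tau_ref=0.002, amp=0.01):
        # SoftReLU: sigma * log(1 + exp(x / sigma))
        j = sigma * np.log1p(np.exp(x / sigma))
        # LIF rate curve applied to the softened current
        return amp / (tau_ref + tau_rc * np.log1p(1.0 / j))

    print(softlif(np.array([0.5, 1.0, 2.0])))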
class hessianfree.nonlinearities.Continuous(base, tau=1.0, dt=1.0)

    Bases: hessianfree.nonlinearities.Nonlinearity

    Creates a version of the base nonlinearity that operates in continuous
    time (filtering the inputs with the given tau/dt).

    \[\frac{ds}{dt} = \frac{x - s}{\tau}\]
    \[f(x) = base(s)\]

    Parameters:
        - base (Nonlinearity) – nonlinear output function applied to the
          continuous state
        - tau (float) – time constant of the input filter (a higher value
          means the internal state changes more slowly)
        - dt (float) – simulation time step
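To make the filtering concrete, the sketch below steps the ds/dt equation with simple Euler integration and applies the base nonlinearity to the filtered state at each step. This is an illustrative standalone transcription of the math, not the class's actual implementation; in the library you would simply wrap an existing nonlinearity, e.g. Continuous(Tanh(), tau=5.0, dt=1.0).

    import numpy as np

    def filtered_outputs(base_fn, xs, tau=5.0, dt=1.0):
        """Euler integration of ds/dt = (x - s) / tau, then f(x) = base(s)."""
        s = np.zeros_like(xs[0])
        outputs = []
        for x in xs:
            s = s + (x - s) * dt / tau
            outputs.append(base_fn(s))
        return outputs

    print(filtered_outputs(np.tanh, [np.ones(3)] * 5))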
class hessianfree.nonlinearities.Plant(stateful=True)

    Bases: hessianfree.nonlinearities.Nonlinearity

    Base class for a plant that can be called to dynamically generate inputs
    for a network.

    See demos.plant() for an example of this being used in practice.

    __call__(x)

        Update the internal state of the plant based on its input.

        Parameters:
            - x – the output of the last layer in the network on the
              previous timestep

    get_vecs()

        Return a tuple of the (inputs, targets) vectors generated by the
        plant since the last reset.

    reset(init=None)

        Reset the plant to its initial state.

        Parameters:
            - init – override the default initial state with these values
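A minimal sketch of this interface, assuming the three methods documented above are all that is strictly required. The class and the way it stores inputs/targets below are hypothetical, and a working plant also needs whatever shape and derivative bookkeeping the network expects; see demos.plant() for a complete, working example.

    import numpy as np
    from hessianfree.nonlinearities import Plant

    class Integrator(Plant):
        """Hypothetical plant whose state accumulates the network's output."""

        def __init__(self, init_state=0.0):
            super(Integrator, self).__init__(stateful=True)
            self.init_state = init_state
            self.reset()

        def __call__(self, x):
            # x is the network's output from the previous timestep
            self.state = self.state + x
            self.inputs.append(np.copy(self.state))
            self.targets.append(np.zeros_like(self.state))  # drive the state to zero
            return self.state

        def get_vecs(self):
            return np.asarray(self.inputs), np.asarray(self.targets)

        def reset(self, init=None):
            self.state = self.init_state if init is None else init
            self.inputs, self.targets = [], []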
Loss functions
class hessianfree.loss_funcs.LossFunction

    Bases: object

    Defines a loss function that maps nonlinearity activations to error.

    loss(activities, targets)

        Computes the loss for each unit in the network.

        Note that most loss functions are based only on the output of the
        final layer, activities[-1]. However, the activities of all layers
        are passed here so that loss functions can include things like
        sparsity constraints. Targets, however, are only defined for the
        output layer.

        Targets can be defined as np.nan, which will be translated into
        zero error.

        Parameters:
            - activities (list) – output activations of each layer
            - targets (ndarray) – target activation values for the last layer

    d_loss(activities, targets)

        First derivative of the loss function (with respect to activities).
hessianfree.loss_funcs.output_loss(func)

    Convenience decorator that takes a loss defined for the output layer and
    converts it into the more general form defined in terms of all layers.
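A sketch of how this decorator might be used, assuming it is applied to the loss/d_loss methods of a LossFunction subclass so that they receive only the final layer's activations (an assumption; it may equally be intended for standalone functions). The AbsoluteError class itself is hypothetical.

    import numpy as np
    from hessianfree.loss_funcs import LossFunction, output_loss

    class AbsoluteError(LossFunction):
        """Hypothetical loss defined only on the output layer."""

        @output_loss
        def loss(self, output, targets):
            # the decorator supplies activities[-1] here instead of all layers
            return np.abs(output - targets)

        @output_loss
        def d_loss(self, output, targets):
            return np.sign(output - targets)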
class hessianfree.loss_funcs.SquaredError

    Bases: hessianfree.loss_funcs.LossFunction

    Squared error

    \(\frac{1}{2} \sum(output - target)^2\)
class hessianfree.loss_funcs.CrossEntropy

    Bases: hessianfree.loss_funcs.LossFunction

    Cross-entropy error

    \(-\sum(target * \log(output))\)
class hessianfree.loss_funcs.ClassificationError

    Bases: hessianfree.loss_funcs.LossFunction

    Classification error

    \(argmax(output) \neq argmax(target)\)

    Note: d_loss and d2_loss are not defined; classification error should
    only be used for validation, which doesn't require either.
class hessianfree.loss_funcs.StructuralDamping(weight, layers=None, optimizer=None)

    Bases: hessianfree.loss_funcs.LossFunction

    Applies structural damping, which penalizes layers for having highly
    variable output activity.

    Note: this is not exactly the same as the structural damping in
    Martens (2010), because it is applied on the output side of the
    nonlinearity (meaning that this error will be filtered through
    d_activations during the backwards propagation).

    Parameters:
        - weight (float) – scale on structural damping relative to other losses
        - layers (list) – indices specifying which layers will have the
          damping applied (defaults to all except the first/last layers)
        - optimizer (Optimizer) – if provided, the weight on structural
          damping will be scaled relative to the damping attribute in the
          optimizer (so that any processes dynamically adjusting the damping
          during the optimization will also affect the structural damping)
class hessianfree.loss_funcs.SparseL1(weight, layers=None, target=0.0)

    Bases: hessianfree.loss_funcs.LossFunction

    Imposes an L1 sparsity constraint on nonlinearity activations.

    Parameters:
        - weight (float) – relative weight of the sparsity constraint
        - layers (list) – indices specifying which layers will have the
          sparsity constraint applied (defaults to all except the first/last
          layers)
        - target (float) – target activation level for the nonlinearities
class hessianfree.loss_funcs.SparseL2(weight, layers=None, target=0.0)

    Bases: hessianfree.loss_funcs.LossFunction

    Imposes an L2 sparsity constraint on nonlinearity activations.

    Parameters:
        - weight (float) – relative weight of the sparsity constraint
        - layers (list) – indices specifying which layers will have the
          sparsity constraint applied (defaults to all except the first/last
          layers)
        - target (float) – target activation level for the nonlinearities
class hessianfree.loss_funcs.LossSet(set)

    Bases: hessianfree.loss_funcs.LossFunction

    Combines several loss functions into one (e.g., combining SquaredError
    and SparseL2). It doesn't need to be created directly; a list of loss
    functions can be passed to FFNet/RNNet and a LossSet will be created
    automatically.

    Parameters:
        - set (list) – list of LossFunctions to combine

    group_func(func_name, activities, targets)

        Computes the given function for each LossFunction in the set, and
        sums the results.
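As noted above, a LossSet is normally created implicitly by passing a list of loss functions to the network. The sketch below assumes the FFNet constructor accepts that list through a loss_type keyword; the keyword name is an assumption, so check the FFNet documentation.

    import hessianfree as hf
    from hessianfree.loss_funcs import SquaredError, SparseL2

    # the network wraps the list in a LossSet internally (assumed keyword name)
    net = hf.FFNet([1, 20, 1], loss_type=[SquaredError(), SparseL2(0.01)])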