Feedforward networks
Implementation of a feedforward network, including the Gauss-Newton approximation for use in Hessian-free optimization.
Based on Martens, J. (2010). Deep learning via Hessian-free optimization. In Proceedings of the 27th International Conference on Machine Learning.
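For orientation, the Gauss-Newton approximation mentioned above is commonly written as below. The notation is chosen here for illustration and is not taken from the library: J is the Jacobian of the network outputs with respect to the parameters, H_L is the Hessian of the loss with respect to the outputs, and v is an arbitrary vector in parameter space. Hessian-free methods typically need only the product G v, which can be formed from right to left without ever building G.

```latex
G = J^{\top} H_L J, \qquad G v = J^{\top} \bigl( H_L ( J v ) \bigr)
```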
class hessianfree.ffnet.FFNet(shape, layers=<hessianfree.nonlinearities.Logistic object>, conns=None, loss_type=<hessianfree.loss_funcs.SquaredError object>, W_init_params=None, use_GPU=False, load_weights=None, debug=False, rng=None, dtype=<class 'numpy.float32'>)
Bases: object
Implementation of a feedforward network (including gradient/curvature computation).
Parameters:
- shape (list) – the number of neurons in each layer
- layers (Nonlinearity or list) – nonlinearity to use in the network (or a list giving a nonlinearity for each layer)
- conns (dict) – dictionary of the form {layer_x:[layer_y, layer_z], ...} specifying the connections between layers (default is to connect in series)
- loss_type (LossFunction or list) – loss function (or list of loss functions) used to evaluate network
- W_init_params (dict) – parameters passed to init_weights() (see parameter descriptions in that function)
- use_GPU (bool) – run curvature computation on GPU (requires PyCUDA and scikit-cuda)
- load_weights (str or ndarray) – load initial weights from given array or filename
- debug (bool) – activates expensive features to help with debugging
- rng (RandomState) – used to generate any random numbers for this network (use this to control the seed)
- dtype (dtype) – floating point precision used throughout the network
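To make the shape and conns arguments more concrete, here is a small sketch in plain Python. It assumes that each key of conns is a source layer index and each value lists the layers it feeds into, and that "in series" means layer i connects only to layer i + 1. The helpers series_conns and weight_shapes are hypothetical names used for illustration; they are not part of hessianfree.

```python
def series_conns(shape):
    """Default wiring (assumed): each layer feeds only the next one."""
    return {pre: [pre + 1] for pre in range(len(shape) - 1)}


def weight_shapes(shape, conns):
    """(pre, post) sizes of the weight matrix for every connection."""
    return {(pre, post): (shape[pre], shape[post])
            for pre in sorted(conns) for post in conns[pre]}


shape = [2, 5, 1]                      # 2 inputs, 5 hidden units, 1 output
conns = series_conns(shape)            # {0: [1], 1: [2]}
skip = {0: [1, 2], 1: [2]}             # adds a direct input -> output link

print(weight_shapes(shape, conns))     # {(0, 1): (2, 5), (1, 2): (5, 1)}
print(weight_shapes(shape, skip))
```

Under these assumptions, the skip dictionary adds one extra weight matrix of shape (2, 1) for the direct link from layer 0 to layer 2.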
run_epochs(inputs, targets, optimizer, max_epochs=100, minibatch_size=None, test=None, test_err=None, target_err=1e-06, plotting=False, file_output=None, print_period=10)
Apply the given optimizer with a sequence of (mini)batches.
Parameters:
- inputs (ndarray or Plant) – input vectors (or a Plant that will generate the input vectors dynamically)
- targets (ndarray) – target vectors corresponding to each input vector (or None if a plant is being used)
- optimizer – computes the weight update each epoch (see optimizers.py)
- max_epochs (int) – the maximum number of epochs to run
- minibatch_size (int) – the size of the minibatch to use in each epoch (or None to use full batches)
- test (tuple) – tuple of (inputs,targets) to use as the test data (if None then the same inputs and targets as training will be used)
- test_err (LossFunction) – a custom error function to be applied to the test data (e.g., classification error)
- target_err (float) – run will terminate if this test error is reached
- file_output (str) – output files from the run will use this as a prefix (if None then don’t output files)
- plotting (bool) – if True then data from the run will be output to a file, which can be displayed via dataplotter.py
- print_period (int) – print out information about the run every x epochs
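The control flow implied by these parameters can be sketched as follows. This is a toy illustration, not the library's implementation: the model is a single scalar weight, the step callback is a hypothetical stand-in for the optimizer, and plain gradient descent is used in place of Hessian-free optimization. It only shows how max_epochs, minibatch_size, and target_err might interact.

```python
import random


def run_epochs_sketch(inputs, targets, step, max_epochs=100,
                      minibatch_size=None, target_err=1e-6, rng=None):
    """Hypothetical control flow only; `step` stands in for the optimizer."""
    rng = rng or random.Random(0)
    w = 0.0                                   # single scalar "weight"
    for epoch in range(max_epochs):
        # full batch if minibatch_size is None, else a random subset
        idx = list(range(len(inputs)))
        if minibatch_size is not None:
            idx = rng.sample(idx, minibatch_size)
        batch = [(inputs[i], targets[i]) for i in idx]

        w = w + step(w, batch)                # optimizer proposes an update

        # with no separate test set, the error is measured on the training data
        err = sum((w * x - t) ** 2 for x, t in zip(inputs, targets))
        err /= 2 * len(inputs)
        if err < target_err:                  # early termination
            return w, epoch + 1
    return w, max_epochs


def gradient_step(w, batch, rate=0.1):
    """Plain gradient descent, used here instead of Hessian-free."""
    grad = sum((w * x - t) * x for x, t in batch) / len(batch)
    return -rate * grad


xs = [1.0, 2.0, 3.0, 4.0]
ts = [2.0 * x for x in xs]                    # targets generated by w = 2
w, epochs = run_epochs_sketch(xs, ts, gradient_step, minibatch_size=2)
```

Because the targets are noise-free, every minibatch pulls the weight toward 2 and the loop stops well before max_epochs once the error drops below target_err.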
forward(inputs, params=None, deriv=False)
Compute layer activations for given input and parameters.
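As a rough illustration of what computing layer activations involves, the sketch below runs one input vector through a series-connected network with logistic units. It is an assumption-laden toy: forward_sketch, its explicit weights and biases arguments, and the choice to leave the input layer untransformed are all hypothetical, and the deriv option is not modelled.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward_sketch(x, weights, biases):
    """Return the activations of every layer for one input vector.

    weights[i][j][k] connects unit j of layer i to unit k of layer i + 1.
    """
    activations = [list(x)]                   # layer 0 is the input itself
    for W, b in zip(weights, biases):
        pre = activations[-1]
        # weighted sum of the previous layer plus bias, for each post unit
        z = [sum(pre[j] * W[j][k] for j in range(len(pre))) + b[k]
             for k in range(len(b))]
        activations.append([logistic(v) for v in z])
    return activations


W = [[[1.0, -1.0], [1.0, -1.0]],              # layer 0 -> layer 1 (2 x 2)
     [[2.0], [2.0]]]                          # layer 1 -> layer 2 (2 x 1)
b = [[0.0, 0.0], [-2.0]]
acts = forward_sketch([0.0, 0.0], W, b)
```

The result holds one list per layer, so later gradient or curvature computations can reuse the intermediate activations instead of recomputing them.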
cache_minibatch(inputs, targets, minibatch=None)
Pick a subset of inputs and targets to use in minibatch, and cache the activations for that minibatch.
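The subset selection can be sketched as below, assuming the minibatch is drawn at random without replacement and that passing None means the full batch. The helper pick_minibatch and its minibatch_size argument are hypothetical; the sketch also omits the activation caching that the real method performs.

```python
import random


def pick_minibatch(inputs, targets, minibatch_size=None, rng=None):
    """Return matching subsets of inputs and targets (hypothetical helper)."""
    rng = rng or random.Random()
    n = len(inputs)
    if minibatch_size is None or minibatch_size >= n:
        return list(inputs), list(targets)    # full batch
    idx = rng.sample(range(n), minibatch_size)  # without replacement
    return [inputs[i] for i in idx], [targets[i] for i in idx]


xs = [[0, 0], [0, 1], [1, 0], [1, 1]]
ts = [[0], [1], [1], [0]]
bx, bt = pick_minibatch(xs, ts, minibatch_size=2, rng=random.Random(0))
```

Using the same index list for both sequences keeps each input paired with its own target.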
static J_dot(J, vec, transpose_J=False, out=None)
Compute the product of a Jacobian and some vector.
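A minimal sketch of a Jacobian-vector product with an optional transpose is shown below. It assumes a dense Jacobian stored as a list of rows; how the library actually stores J (for example, whether elementwise nonlinearities use a compact form) is not stated here, and the out argument is not modelled. The name j_dot_sketch is hypothetical.

```python
def j_dot_sketch(J, vec, transpose_J=False):
    """Multiply a dense Jacobian (list of rows) by a vector."""
    if transpose_J:
        # (J^T v)[c] = sum_r J[r][c] * v[r]
        return [sum(J[r][c] * vec[r] for r in range(len(J)))
                for c in range(len(J[0]))]
    # (J v)[r] = sum_c J[r][c] * v[c]
    return [sum(a * v for a, v in zip(row, vec)) for row in J]


J = [[1.0, 2.0],
     [0.0, 3.0]]
print(j_dot_sketch(J, [1.0, 1.0]))                    # [3.0, 3.0]
print(j_dot_sketch(J, [1.0, 1.0], transpose_J=True))  # [1.0, 5.0]
```

The transposed form is the one needed when propagating quantities backward through a layer, which is why both directions are usually exposed by a single routine.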
init_weights(shapes, coeff=1.0, biases=0.0, init_type='sparse')
Weight initialization, given shapes of weight matrices.
Note: coeff, biases, and init_type can be specified by the W_init_params dict in FFNet. Each can be specified as a single value (for all matrices) or as a list giving a value for each matrix.
Parameters:
- shapes (list) – list of (pre,post) shapes for each weight matrix
- coeff (float) – scales the magnitude of the connection weights
- biases (float) – bias values for the post of each matrix
- init_type (str) – type of initialization to use (currently supports ‘sparse’, ‘uniform’, ‘gaussian’)
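The three initialization types could look roughly like the sketch below. The details are assumptions rather than the library's behaviour: 'uniform' draws from [-coeff, coeff], 'gaussian' uses standard deviation coeff, and 'sparse' follows the spirit of the sparse initialization in Martens (2010) by giving each post unit a limited number of nonzero incoming weights. The num_conn argument and the function name are hypothetical, and per-matrix lists of coeff or biases are not handled.

```python
import random


def init_weights_sketch(shapes, coeff=1.0, biases=0.0, init_type="sparse",
                        rng=None, num_conn=15):
    """Return one (W, b) pair per (pre, post) shape. Illustrative only."""
    rng = rng or random.Random()
    out = []
    for pre, post in shapes:
        if init_type == "uniform":
            W = [[rng.uniform(-coeff, coeff) for _ in range(post)]
                 for _ in range(pre)]
        elif init_type == "gaussian":
            W = [[rng.gauss(0.0, coeff) for _ in range(post)]
                 for _ in range(pre)]
        elif init_type == "sparse":
            # assumed scheme: each post unit gets at most `num_conn`
            # nonzero Gaussian inputs; every other weight stays zero
            W = [[0.0] * post for _ in range(pre)]
            for k in range(post):
                for j in rng.sample(range(pre), min(num_conn, pre)):
                    W[j][k] = rng.gauss(0.0, coeff)
        else:
            raise ValueError("unknown init_type: %r" % init_type)
        out.append((W, [biases] * post))
    return out


params = init_weights_sketch([(20, 4), (4, 1)], rng=random.Random(0))
```

Passing a seeded generator, as in the last line, mirrors the role of the rng argument of FFNet in making initialization reproducible.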
get_weights(params, conn)
Get weight matrix for a connection from overall parameter vector.
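One way to picture extracting a connection's weights from a single flat parameter vector is sketched below. The layout is an assumption made for illustration: each connection occupies a contiguous block holding pre * post weights followed by post biases. The explicit offsets and shapes arguments and the name get_weights_sketch are hypothetical; the real method takes only params and conn.

```python
def get_weights_sketch(params, offsets, shapes, conn):
    """Slice (W, b) for connection `conn` out of a flat parameter list.

    Assumed layout per connection: pre * post weights, then post biases.
    """
    pre, post = shapes[conn]
    start = offsets[conn]
    flat_W = params[start:start + pre * post]
    W = [flat_W[j * post:(j + 1) * post] for j in range(pre)]
    b = params[start + pre * post:start + (pre + 1) * post]
    return W, b


shapes = {(0, 1): (2, 3), (1, 2): (3, 1)}
offsets, total = {}, 0
for conn in sorted(shapes):
    offsets[conn] = total
    pre, post = shapes[conn]
    total += (pre + 1) * post                 # weights plus one bias row

params = [float(i) for i in range(total)]     # 9 + 4 = 13 parameters
W, b = get_weights_sketch(params, offsets, shapes, (1, 2))
```

Keeping all parameters in one vector is convenient for optimizers that treat the network as a single point in parameter space, while slicing recovers the per-connection matrices when they are needed.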
init_loss(loss_type)
Set the loss type for this network to the given LossFunction (or a list of functions can be passed to create a LossSet).