GPU functions



Compare GPU vs CPU performance on feedforward curvature calculation.

This can be used to determine whether it is better to run a target network on the CPU or the GPU.


Compare GPU vs CPU performance on recurrent curvature calculation.

This can be used to determine whether it is better to run a target network on the CPU or the GPU.
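
The comparison described above amounts to timing the same computation on two backends and picking the faster one. The following is a minimal, library-independent sketch using only the standard library. The names `compare_backends`, `cpu_impl` and `alt_impl` are hypothetical, chosen for illustration, and are not part of hessianfree.

```python
import timeit


def compare_backends(funcs, args, repeats=5, number=10):
    """Time each callable in `funcs` on `args`.

    Returns a dict mapping each name to its best time per call, in seconds.
    """
    results = {}
    for name, func in funcs.items():
        # take the minimum over several repeats to reduce scheduling noise
        times = timeit.repeat(lambda: func(*args), repeat=repeats,
                              number=number)
        results[name] = min(times) / number
    return results


def cpu_impl(xs):
    # hypothetical stand-in for the CPU computation
    return sum(x * x for x in xs)


def alt_impl(xs):
    # hypothetical stand-in for the same computation on another backend
    return sum(map(lambda x: x * x, xs))


data = list(range(1000))
timings = compare_backends({"cpu": cpu_impl, "alt": alt_impl}, (data,))
fastest = min(timings, key=timings.get)
print(timings, fastest)
```

When timing real GPU code, the device usually has to be synchronized before the clock is read, because CUDA kernel launches are asynchronous.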


Run a profiler on the feedforward curvature calculation.

Parameters: cprofile (bool) – use True if profiling on the CPU, False if using the CUDA profiler

Run a profiler on the recurrent curvature calculation.

Parameters: cprofile (bool) – use True if profiling on the CPU, False if using the CUDA profiler

Run a profiler on the matrix multiplication kernel.

Parameters: cprofile (bool) – use True if profiling on the CPU, False if using the CUDA profiler
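
For the `cprofile=True` case, CPU-side profiling in Python is commonly done with the standard `cProfile` and `pstats` modules. The following is a minimal sketch of that pattern, assuming a hypothetical `workload` function in place of the calculation being profiled. It does not show how the hessianfree profiling functions invoke the profiler internally.

```python
import cProfile
import io
import pstats


def workload(n=200):
    # hypothetical stand-in for the calculation being profiled
    return sum(i * j for i in range(n) for j in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# collect the ten most expensive calls, sorted by cumulative time
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```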


Note: these functions never need to be accessed directly; they are swapped in automatically when the use_gpu=True flag is set in FFNet/RNNet.

hessianfree.gpu.kernel_wrappers.debug_wrapper(cpu_func, debug=False)

Decorator used to specify an equivalent CPU function that can be used to verify the output of a GPU function (for debugging).
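
The general pattern of checking a fast implementation against a trusted reference can be sketched as below. This is an illustration of the idea only, not the library's implementation: `checked_against`, `reference_mean`, `fast_mean` and the `tol` parameter are hypothetical names introduced for the example.

```python
import functools


def checked_against(reference_func, debug=False, tol=1e-6):
    """Hypothetical decorator factory: when `debug` is set, compare the
    wrapped function's output with `reference_func` on the same arguments."""
    def decorator(fast_func):
        if not debug:
            # no checking requested: return the function unchanged
            return fast_func

        @functools.wraps(fast_func)
        def wrapper(*args, **kwargs):
            result = fast_func(*args, **kwargs)
            expected = reference_func(*args, **kwargs)
            if abs(result - expected) > tol:
                raise AssertionError(
                    "%s returned %r, reference returned %r"
                    % (fast_func.__name__, result, expected))
            return result
        return wrapper
    return decorator


def reference_mean(xs):
    # slow but obviously correct reference
    return sum(xs) / len(xs)


@checked_against(reference_mean, debug=True)
def fast_mean(xs):
    # hypothetical stand-in for an optimized implementation
    total = 0.0
    for x in xs:
        total += x
    return total / len(xs)


value = fast_mean([1.0, 2.0, 3.0])
print(value)
```

With `debug=False` the decorator returns the original function untouched, so the check adds no overhead outside of debugging.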

hessianfree.gpu.kernel_wrappers.cublas_dot(a, b, out=None, transpose_a=False, transpose_b=False, increment=False, stream=None)

Matrix multiplication using CUBLAS.
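
To make the role of the flags concrete, here is a CPU-side NumPy sketch of a matrix product with the same keyword names. It rests on an assumption inferred from the parameter names rather than from the source: that `increment=True` adds the product to `out` instead of overwriting it. `dot_reference` is a hypothetical name, and the `stream` argument is omitted because it has no CPU counterpart.

```python
import numpy as np


def dot_reference(a, b, out=None, transpose_a=False, transpose_b=False,
                  increment=False):
    """CPU reference for a matrix product with optional transposes.

    Assumption: `increment=True` means the product is added to `out`
    rather than overwriting it.
    """
    a = a.T if transpose_a else a
    b = b.T if transpose_b else b
    product = np.dot(a, b)
    if out is None:
        return product
    if increment:
        out += product
    else:
        out[...] = product
    return out


a = np.arange(6, dtype=np.float32).reshape(2, 3)
b = np.ones((2, 4), dtype=np.float32)
out = np.zeros((3, 4), dtype=np.float32)

# first call overwrites `out` with a.T @ b, second call accumulates into it
dot_reference(a, b, out=out, transpose_a=True)
dot_reference(a, b, out=out, transpose_a=True, increment=True)
print(out)
```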

hessianfree.gpu.kernel_wrappers.J_dot(J, v, out=None, transpose_J=False, increment=False, stream=None)

Equivalent to the CPU version of J_dot(), on the GPU.

hessianfree.gpu.kernel_wrappers.sum_cols(a, out=None, increment=False, stream=None)

Sum a along columns.

hessianfree.gpu.kernel_wrappers.iadd(a, b, stream=None)

In-place addition of a and b, broadcasting b along columns.

hessianfree.gpu.kernel_wrappers.multiply(a, b, out=None, increment=False, stream=None)

Element-wise product of a and b.

hessianfree.gpu.kernel_wrappers.shared_dot(a, b, out=None, transpose_a=False, transpose_b=False, increment=False, stream=None)

Matrix multiplication that doesn’t rely on CUBLAS (could be swapped in if scikit-cuda were not available for some reason).