GPU functions¶

Profiling¶

hessianfree.gpu.profiling.threshold_calc_G()[source]¶

Compare GPU vs CPU performance on feedforward curvature calculation.

This can use this to determine whether it is better to run some target network on the CPU or GPU.

hessianfree.gpu.profiling.threshold_rnn_calc_G()[source]¶

Compare GPU vs CPU performance on recurrent curvature calculation.

This can use this to determine whether it is better to run some target network on the CPU or GPU.

hessianfree.gpu.profiling.profile_calc_G(cprofile=True)[source]¶

Run a profiler on the feedforward curvature calculation.

Parameters:	cprofile (bool) – use True if profiling on the CPU, False if using the CUDA profiler

hessianfree.gpu.profiling.profile_rnn_calc_G(cprofile=True)[source]¶

Run a profiler on the recurrent curvature calculation.

Parameters:	cprofile (bool) – use True if profiling on the CPU, False if using the CUDA profiler

hessianfree.gpu.profiling.profile_dot(cprofile=True)[source]¶

Run a profiler on the matrix multiplication kernel.

Parameters:	cprofile (bool) – use True if profiling on the CPU, False if using the CUDA profiler

Kernels¶

Note: these functions never need to be accessed directly, they will be swapped in automatically when the use_gpu=True flag is set in FFNet/RNNet.

hessianfree.gpu.kernel_wrappers.debug_wrapper(cpu_func, debug=False)[source]¶: Decorator used to specify an equivalent CPU function that can be used to verify the output of a GPU function (for debugging).

hessianfree.gpu.kernel_wrappers.cublas_dot(a, b, out=None, transpose_a=False, transpose_b=False, increment=False, stream=None)[source]¶: Matrix multiplication using CUBLAS.

hessianfree.gpu.kernel_wrappers.J_dot(J, v, out=None, transpose_J=False, increment=False, stream=None)[source]¶: Equivalent to J_dot(), on the GPU.

hessianfree.gpu.kernel_wrappers.sum_cols(a, out=None, increment=False, stream=None)[source]¶: Sum a along columns.

hessianfree.gpu.kernel_wrappers.iadd(a, b, stream=None)[source]¶: In-place addition of a and b, broadcasting b along columns.

hessianfree.gpu.kernel_wrappers.multiply(a, b, out=None, increment=False, stream=None)[source]¶: Element-wise product of a and b.

hessianfree.gpu.kernel_wrappers.shared_dot(a, b, out=None, transpose_a=False, transpose_b=False, increment=False, stream=None)[source]¶: Matrix multiplication that doesn’t rely on CUBLAS (could be swapped in if scikit-cuda were not available for some reason).