AlgoPy, Algorithmic Differentiation in Python¶

What is AlgoPy?¶

The purpose of AlgoPy is the evaluation of higher-order derivatives in the forward and reverse mode of Algorithmic Differentiation (AD) of functions that are implemented as Python programs. Particular focus are functions that contain numerical linear algebra functions as they often appear in statistically motivated functions. The intended use of AlgoPy is for easy prototyping at reasonable execution speeds. More precisely, for a typical program a directional derivative takes order 10 times as much time as time as the function evaluation. This is approximately also true for the gradient.

What can AlgoPy do for you?¶

evaluation of derivatives useful for nonlinear continuous optimization

gradient

Jacobian

Hessian

Jacobian vector product

vector Jacobian product

Hessian vector product

vector Hessian vector product

higher-order tensors

Taylor series evaluation

for modeling higher-order processes

could be used to compute Taylor series expansions for ODE/DAE integration.

Getting Started:¶

For the impatient, we show a minimalistic example how AlgoPy can be used to compute derivatives.

import numpy, algopy
from algopy import UTPM, exp

def eval_f(x):
    """ some function """
    return x[0]*x[1]*x[2] + exp(x[0])*x[1]

# forward mode without building the computational graph
# -----------------------------------------------------
x = UTPM.init_jacobian([3,5,7])
y = eval_f(x)
algopy_jacobian = UTPM.extract_jacobian(y)
print('jacobian = ',algopy_jacobian)

# reverse mode using a computational graph
# ----------------------------------------

# STEP 1: trace the function evaluation
cg = algopy.CGraph()
x = algopy.Function([1,2,3])
y = eval_f(x)
cg.trace_off()
cg.independentFunctionList = [x]
cg.dependentFunctionList = [y]

# STEP 2: use the computational graph to evaluate derivatives
print('gradient =', cg.gradient([3.,5,7]))
print('Jacobian =', cg.jacobian([3.,5,7]))
print('Hessian =', cg.hessian([3.,5.,7.]))
print('Hessian vector product =', cg.hess_vec([3.,5.,7.],[4,5,6]))

If one executes that code one obtains as output:

$ python getting_started.py
jacobian =  [ 135.42768462   41.08553692   15.        ]
gradient = [array([ 135.42768462,   41.08553692,   15.        ])]
Jacobian = [array([[ 135.42768462,   41.08553692,   15.        ]])]
Hessian = [[[ 100.42768462   27.08553692    5.        ]
  [  27.08553692    0.            3.        ]
  [   5.            3.            0.        ]]]
Hessian vector product = [ 567.13842308  126.34214769   35.        ]

Help improve AlgoPy¶

If you have any questions, suggestions or bug reports please use the mailing list http://groups.google.com/group/algopy?msg=new&lnk=gcis or alternatively write me an email(sebastian.walter@gmail.com). This will make it much easier for me to provide code/documentation that is easier to understand. Of course, you are also welcome to contribute code, bugfixes, examples, success stories ;), ...

Issues and common pitfalls¶

Warning

Do not use numpy.ndarray with algopy.UTPM or algopy.Function as elements!

Example 1:

import numpy, algopy

x = algopy.Function(1)
y = algopy.Function(2)

# wrong
z = numpy.array([x,y])

# right
z = algopy.zeros(2, dtype=x)
z[0] = x
z[1] = y

Example 2:

x = algopy.UTPM(numpy.ones((2,1)))
y = algopy.UTPM(2*numpy.ones((2,1)))

# wrong
z = numpy.array([x,y])

# right
z = algopy.zeros(2, dtype=x)
z[0] = x
z[1] = y

Note

If you use NumPy version > 1.5 the reverse mode could be very slow if you use broadcasting.

Reason:

Numpy version 1.5 and above have a bug in binary inplace operations (imul, iadd, ...) when array elements point to overlapping memory regions, e.g., when strides = (0,8).

There is a workaround in AlgoPy for this case, but it is probably rather slow for large matrices since a Python loops needs to access all elements in the matrix.

See discussion on http://thread.gmane.org/gmane.comp.python.numeric.general/51379

and the issue on https://github.com/numpy/numpy/issues/2705

This bug is only relevant in the reverse mode and you use broadcasting, e.g., when you run the following example:

import numpy, algopy

def f(x):
    N, = x.shape
    A = algopy.zeros((N,N), dtype=x)
    A[0,0] = x[2]
    A[1,2] = x[1]
    return algopy.sum(A*x)


cg = algopy.CGraph()

x = algopy.Function(numpy.ones(3))
y = f(x)

cg.independentFunctionList = [x]
cg.dependentFunctionList = [y]

print cg.gradient(numpy.array([1.,2.,3.]))

Potential Improvements¶

better memory management to speed things up

direct support for nested derivatives (requires the algorithms to be generic)

better complex number support, in particular also the reverse mode

support for sparse Jacobian and sparse Hessian computations using graph coloring as explained in http://portal.acm.org/citation.cfm?id=1096222

Installation and Upgrade:¶

AlgoPy works on Python 2.6+ and Python 3.0+.

Current version is 0.5.1

Official releases:

available at: http://pypi.python.org/pypi/algopy
with pip
- $ pip install algopy
with easy_install
- $ easy_install algopy for installation
- $ easy_install –upgrade algopy to upgrade to the newest version
NetBSD package
- http://pkgsrc.se/wip/py-algopy
- http://cvsweb.se.netbsd.org/cgi-bin/bsdweb.cgi/wip/py-algopy

Bleeding edge:

the most recent version is available at https://github.com/b45ch1/algopy

includes additional documentation, e.g. talks and additional examples and the sphinx *.rst documents

Dependencies:

ALGOPY Core:

numpy (known to work with numpy >= 1.6.2)

scipy (known to work with scipy >= 0.11.0)

ALGOPY Examples:

pyadolc

Run tests:

Nose

Documentation:

sphinx

matplotlib, mayavi2, yapgvb

You can install these dependencies with pip:
pip install nose
pip install numpy
pip install scipy
Or on Debian/Ubuntu:
sudo apt-get install -qq python-sphinx python-nose python$-numpy python-scipy
or:
sudo apt-get install -qq python3-sphinx python3-nose python3$-numpy python3-scipy

Note

If your scipy-version is too outdated, you can install a more recent version using:

echo 'yes' | sudo add-apt-repository ppa:pylab/stable
sudo apt-get update

Please note that this adds an unofficial package repository to your system.

How does it work?:¶

AlgoPy offers the forward mode and the reverse mode of AD.

Forward Mode of AD:

The basic idea is the computation on (univariate) Taylor polynomials with with matrix coefficients. More precisely, AlgoPy supports univariate Taylor polynomial (UTP) arithmetic where the coefficients of the polynomial are numpy.ndarrays. To distinguish Taylor polynomials from real vectors resp. matrices they are written with enclosing brackets:

\[[x]_D = [x_0, \dots, x_{D-1}] = \sum_{d=0}^{D-1} x_d T^d \;,\]

where each $x_0, x_1, \dots$ are arrays, e.g. a (5,7) array. This mathematical object is described by numpy.ndarray with shape (D,P, 5,7). The $T$ is an indeterminate, i.e. a formal/dummy variable. Roughly speaking, this is the UTP equivalent to the imaginary number $i$ in complex arithmetic. The P can be used to compute several Taylor expansions at once. I.e., a vectorization to avoid the recomputation of the same functions with different inputs.

If the input UTPs are correctly initialized one can interpret the coefficients of the resulting polynomial as higher-order derivatives. Have a look at the Taylor series expansion example for a more detailed discussion.

Reverse Mode of AD:

To allow the use of the reverse mode of AD a simple code tracer has been implemented in algopy.tracer. The idea is to record the computation procedure in a datastructure s.t. the control flow sequence can walked in reverse order. There is no complete documentation for the reverse mode yet.

Current Issues:¶

some algorithms require vectors to be columns of a matrix. I.e. if x is a vector it should be initialized as x = UTPM(numpy.random.rand(D,P,N,1) and not as UTPM(numpy.random.rand(D,P,N)) as one would typically do it using numpy.

there is no vectorized reverse mode yet. That means that one can compute columns of a Jacobian of dimension (M,N) by propagating N directional derivatives at once. In the reverse mode one would like to propagate M adjoint directions at once. However, this is not implemented yet, i.e. one has to repeat the procedure M times.

Version Changelog¶

Version 0.2.2
- fixed some broadcasting bugs with UTPM instances
- fixed a bug in algopy.zeros
Version 0.2.3
- added UTPM.init_jacobian and UTPM.extract_jacobian
- added UTPM.init_hessian and UTPM.extract_hessian
- added UTPM.init_tensor and UTPM.extract_tensor
- added UTPM.__len__, i.e. len(x) works now for x a UTPM instance
- fixed a bug in algopy.dot(x,y) in the case when x is a numpy.ndarray and y is a UTPM instance
Version 0.3.0:
- renamed push_forward to pushforward, this is more consistent w.r.t. the pullback
- UTPM.__repr__ now returns a string of the form UTPM(...)
- refactored the tracer: it should now be possible to trace the function evaluation with normal numpy.ndarrays. After that, one can use cg.pushforward with UTPM instances or call cg.gradient, etc.
- UTPM.reshape is now a method, not a class method
- added broadcasting support for __setitem__, iadd and isub
- added Function.ndim
- added preliminary complex numbers support for arithmetic with UTPM instances (reverse mode using the tracer is not supported yet)
- UTPM.reshape now can also take integers as input, not only tuples of integers
- added UTPM.tan, UTPM.arcsin, UTPM.arccos, UTPM.arctan, UTPM.sinh, UTPM.cosh, UTPM.tanh
- made init_hessian and extract_hessian generic (to make it useful for complex valued functions)
- added comparison operators <,>,<=,>=,== to UTPM
- added UTPM.init_jac_vec and UTPM.extract_jac_vec
- added CGraph.function, CGraph.gradient, CGraph.hessian, CGraph.hess_vec
Version 0.3.1
- replaced algopy.sum by a faster implementation
- fixed a bug in getitem of the UTPM instance: now works also with numpy.int64 as index
- added dedicated algopy.sum and algopy.prod
- added UTPM.pb_sqrt
- fixed bug in tracing operations involving neg(x)
- added algopy.outer
- changed API of CGraph.hessian, CGraph.jac_vec etc. One has now to write CGraph.jacobian(x) instead of CGraph.jacobian([x]).
Version 0.3.2
- improved error reporting in the reverse mode: when “something goes wrong” in cg.gradient([1.,2.,3.]) one now gets a much more detailed traceback
- added A.reshape(...) support to the reverse mode
- improved support for broadcasting for UTPM instances
Version 0.4.0
- added support for a variety of new functions, mostly contributed by Alex Griffing, NCSU: expm, hyp1f1, hyperu, hyp2f0, polygamma, psi, erf, erfi, dawsn, logit, expit
Version 0.5.0
- add Python 3 compatibility
- add Travis CI

Unit Test¶

AlgoPy uses the same testing facilitities as NumPy. I.e., one can run the complete unit test with:

$ python -c "import algopy; algopy.test()"

Table Of Contents

Next topic

This Page

AlgoPy, Algorithmic Differentiation in Python¶

What is AlgoPy?¶

What can AlgoPy do for you?¶

Getting Started:¶

Help improve AlgoPy¶

Issues and common pitfalls¶

Potential Improvements¶

Installation and Upgrade:¶

How does it work?:¶

Further Reading¶

Current Issues:¶

Version Changelog¶

Unit Test¶

Indices and tables¶

Navigation

Table Of Contents

Next topic

This Page

Quick search

AlgoPy, Algorithmic Differentiation in Python¶

What is AlgoPy?¶

What can AlgoPy do for you?¶

Getting Started:¶

Help improve AlgoPy¶

Issues and common pitfalls¶

Potential Improvements¶

Related Work¶

Publications related to AlgoPy¶

Installation and Upgrade:¶

How does it work?:¶

Further Reading¶

Current Issues:¶

Version Changelog¶

Unit Test¶

Indices and tables¶

Navigation