In this example it is the goal to compute derivatives of the Moore-Penrose pseudoinverse. We compare different possibilities:
- A naive approach where A^T A is explicitly computed (numerically unstable)
- A QR approach where at first a QR decomposition of A is formed and the inverse is computed by a forward and then back substitution of R. T
- A direct approach where an analytic formula for the derivatives of the Moore-Penrose Formula is derived. Then usage of the QR decomposition is used to make the computation of the analytical formula numerically stable.
More explicitly, we compute
where \(A \in \mathbb R^{M \times N}\) with \(M \geq N\) with possibly bad condition number but full column rank.
import numpy
from algopy import CGraph, Function, UTPM, dot, qr, eigh, inv, solve
# first order derivatives, one directional derivative
# D - 1 is the degree of the Taylor polynomial
# P directional derivatives at once
# M number of rows of A
# N number of cols of A
D,P,M,N = 2,1,5,2
# generate badly conditioned matrix A
A = UTPM(numpy.zeros((D,P,M,N)))
x = UTPM(numpy.zeros((D,P,M,1)))
y = UTPM(numpy.zeros((D,P,M,1)))
x.data[0,0,:,0] = [1,1,1,1,1]
x.data[1,0,:,0] = [1,1,1,1,1]
y.data[0,0,:,0] = [1,2,1,2,1]
y.data[1,0,:,0] = [1,2,1,2,1]
alpha = 10**-5
A = dot(x,x.T) + alpha*dot(y,y.T)
A = A[:,:2]
# Method 1: Naive approach
Apinv = dot(inv(dot(A.T,A)),A.T)
print('naive approach: A Apinv A - A = 0 \n', dot(dot(A, Apinv),A) - A)
print('naive approach: Apinv A Apinv - Apinv = 0 \n', dot(dot(Apinv, A),Apinv) - Apinv)
print('naive approach: (Apinv A)^T - Apinv A = 0 \n', dot(Apinv, A).T - dot(Apinv, A))
print('naive approach: (A Apinv)^T - A Apinv = 0 \n', dot(A, Apinv).T - dot(A, Apinv))
# Method 2: Using the differentiated QR decomposition
Q,R = qr(A)
tmp1 = solve(R.T, A.T)
tmp2 = solve(R, tmp1)
Apinv = tmp2
print('QR approach: A Apinv A - A = 0 \n', dot(dot(A, Apinv),A) - A)
print('QR approach: Apinv A Apinv - Apinv = 0 \n', dot(dot(Apinv, A),Apinv) - Apinv)
print('QR approach: (Apinv A)^T - Apinv A = 0 \n', dot(Apinv, A).T - dot(Apinv, A))
print('QR approach: (A Apinv)^T - A Apinv = 0 \n', dot(A, Apinv).T - dot(A, Apinv))
# Method 3: Stable evaluation of the analytical derivative formula
A0 = A.data[0,0]
A1 = A.data[1,0]
Q0, R0 = numpy.linalg.qr(A0)
# compute nominal solution
tmp1 = solve(R0.T, A0.T)
C0 = solve(R0, tmp1)
# compute first directional derivative
tmp2 = A1.T - dot( dot(A1.T, A0) + dot(A0.T, A1), C0)
tmp1 = solve(R0.T, tmp2)
C1 = solve(R0, tmp1)
Apinv.data[0,0] = C0
Apinv.data[1,0] = C1
print('analytical approach: A Apinv A - A = 0 \n', dot(dot(A, Apinv),A) - A)
print('analytical approach: Apinv A Apinv - Apinv = 0 \n', dot(dot(Apinv, A),Apinv) - Apinv)
print('analytical approach: (Apinv A)^T - Apinv A = 0 \n', dot(Apinv, A).T - dot(Apinv, A))
print('analytical approach: (A Apinv)^T - A Apinv = 0 \n', dot(A, Apinv).T - dot(A, Apinv))
Running the code yields the output shown below. One can see that computing the Moore-Penrose pseudoinverse with the naive approach yields a solution (including first directional derivatives) that only poorly fulfills the defining equations (c.f. e.g. http://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse). The other two methods yield a solution that results in a similarly small residual of the defining equations. Note that the result is a (D,P,M,N) array, i.e., the first block of numbers is the residual of the nominal solution and the second block of numbers is the residual in the first derivatives:
naive approach: A Apinv A - A = 0
[[[[ 1.38013414e-06 1.38012688e-06]
[ 1.38012190e-06 1.38018107e-06]
[ 1.38013414e-06 1.38012688e-06]
[ 1.38012190e-06 1.38018107e-06]
[ 1.38013414e-06 1.38012688e-06]]]
[[[ -3.14117999e-06 -3.14127715e-06]
[ 2.91085793e-05 2.91090665e-05]
[ -3.14117999e-06 -3.14127715e-06]
[ 2.91085793e-05 2.91090665e-05]
[ -3.14117999e-06 -3.14127715e-06]]]]
naive approach: Apinv A Apinv - Apinv = 0
[[[[ 0.22144346 -0.33215987 0.22144346 -0.33215987 0.22144346]
[-0.22144008 0.3321555 -0.22144008 0.3321555 -0.22144008]]]
[[[-1.0878914 0.01930878 -1.0878914 0.01930878 -1.0878914 ]
[ 1.08787702 -0.01930766 1.08787702 -0.01930766 1.08787702]]]]
naive approach: (Apinv A)^T - Apinv A = 0
[[[[ 0.00000000e+00 5.64176865e-08]
[ -5.64176865e-08 0.00000000e+00]]]
[[[ 0.00000000e+00 6.45011636e+00]
[ -6.45011636e+00 0.00000000e+00]]]]
naive approach: (A Apinv)^T - A Apinv = 0
[[[[ 0.00000000e+00 6.20026697e-10 0.00000000e+00 6.20026697e-10
0.00000000e+00]
[ -6.20026697e-10 0.00000000e+00 -6.20026697e-10 0.00000000e+00
-6.20026697e-10]
[ 0.00000000e+00 6.20026697e-10 0.00000000e+00 6.20026697e-10
0.00000000e+00]
[ -6.20026697e-10 0.00000000e+00 -6.20026697e-10 0.00000000e+00
-6.20026697e-10]
[ 0.00000000e+00 6.20026697e-10 0.00000000e+00 6.20026697e-10
0.00000000e+00]]]
[[[ 0.00000000e+00 6.45163836e-06 0.00000000e+00 6.45163836e-06
0.00000000e+00]
[ -6.45163836e-06 0.00000000e+00 -6.45163836e-06 0.00000000e+00
-6.45163836e-06]
[ 0.00000000e+00 6.45163836e-06 0.00000000e+00 6.45163836e-06
0.00000000e+00]
[ -6.45163836e-06 0.00000000e+00 -6.45163836e-06 0.00000000e+00
-6.45163836e-06]
[ 0.00000000e+00 6.45163836e-06 0.00000000e+00 6.45163836e-06
0.00000000e+00]]]]
QR approach: A Apinv A - A = 0
[[[[ -3.18345350e-12 -3.18389759e-12]
[ 1.54691815e-11 1.54698476e-11]
[ -3.18345350e-12 -3.18389759e-12]
[ 1.54691815e-11 1.54698476e-11]
[ -3.18345350e-12 -3.18389759e-12]]]
[[[ 6.38786801e-11 6.38786801e-11]
[ -4.68758365e-11 -4.68767247e-11]
[ 6.38786801e-11 6.38786801e-11]
[ -4.68758365e-11 -4.68767247e-11]
[ 6.38786801e-11 6.38786801e-11]]]]
QR approach: Apinv A Apinv - Apinv = 0
[[[[ 1.94382301e-06 -3.84821760e-06 1.94382301e-06 -3.84821760e-06
1.94382301e-06]
[ -1.94371387e-06 3.84805026e-06 -1.94371387e-06 3.84805026e-06
-1.94371387e-06]]]
[[[ -6.20307401e-06 1.85714744e-05 -6.20307401e-06 1.85714744e-05
-6.20307401e-06]
[ 6.20371429e-06 -1.85723184e-05 6.20371429e-06 -1.85723184e-05
6.20371429e-06]]]]
QR approach: (Apinv A)^T - Apinv A = 0
[[[[ 0.00000000e+00 3.73022496e-06]
[ -3.73022496e-06 0.00000000e+00]]]
[[[ 0.00000000e+00 -2.96084671e-05]
[ 2.96084671e-05 0.00000000e+00]]]]
QR approach: (A Apinv)^T - A Apinv = 0
[[[[ 0.00000000e+00 8.84625706e-13 0.00000000e+00 8.84625706e-13
0.00000000e+00]
[ -8.84625706e-13 0.00000000e+00 -8.84625706e-13 0.00000000e+00
-8.84625706e-13]
[ 0.00000000e+00 8.84625706e-13 0.00000000e+00 8.84625706e-13
0.00000000e+00]
[ -8.84625706e-13 0.00000000e+00 -8.84625706e-13 0.00000000e+00
-8.84625706e-13]
[ 0.00000000e+00 8.84625706e-13 0.00000000e+00 8.84625706e-13
0.00000000e+00]]]
[[[ 0.00000000e+00 9.56390522e-12 0.00000000e+00 9.56390522e-12
0.00000000e+00]
[ -9.56390522e-12 0.00000000e+00 -9.56390522e-12 0.00000000e+00
-9.56390522e-12]
[ 0.00000000e+00 9.56390522e-12 0.00000000e+00 9.56390522e-12
0.00000000e+00]
[ -9.56390522e-12 0.00000000e+00 -9.56390522e-12 0.00000000e+00
-9.56390522e-12]
[ 0.00000000e+00 9.56390522e-12 0.00000000e+00 9.56390522e-12
0.00000000e+00]]]]
analytical approach: A Apinv A - A = 0
[[[[ -3.18345350e-12 -3.18389759e-12]
[ 1.54691815e-11 1.54698476e-11]
[ -3.18345350e-12 -3.18389759e-12]
[ 1.54691815e-11 1.54698476e-11]
[ -3.18345350e-12 -3.18389759e-12]]]
[[[ 2.81423151e-09 2.70797518e-09]
[ -4.25869606e-09 -4.09931600e-09]
[ 2.81423151e-09 2.70797518e-09]
[ -4.25869606e-09 -4.09931600e-09]
[ 2.81423151e-09 2.70797518e-09]]]]
analytical approach: Apinv A Apinv - Apinv = 0
[[[[ 1.94382301e-06 -3.84821760e-06 1.94382301e-06 -3.84821760e-06
1.94382301e-06]
[ -1.94371387e-06 3.84805026e-06 -1.94371387e-06 3.84805026e-06
-1.94371387e-06]]]
[[[ 8.85959818e-01 -1.32856906e+00 8.85959818e-01 -1.32856906e+00
8.85959818e-01]
[ -8.85947414e-01 1.32855046e+00 -8.85947414e-01 1.32855046e+00
-8.85947414e-01]]]]
analytical approach: (Apinv A)^T - Apinv A = 0
[[[[ 0.00000000e+00 3.73022496e-06]
[ -3.73022496e-06 0.00000000e+00]]]
[[[ 0.00000000e+00 -1.39551660e-03]
[ 1.39551660e-03 0.00000000e+00]]]]
analytical approach: (A Apinv)^T - A Apinv = 0
[[[[ 0.00000000e+00 8.84625706e-13 0.00000000e+00 8.84625706e-13
0.00000000e+00]
[ -8.84625706e-13 0.00000000e+00 -8.84625706e-13 0.00000000e+00
-8.84625706e-13]
[ 0.00000000e+00 8.84625706e-13 0.00000000e+00 8.84625706e-13
0.00000000e+00]
[ -8.84625706e-13 0.00000000e+00 -8.84625706e-13 0.00000000e+00
-8.84625706e-13]
[ 0.00000000e+00 8.84625706e-13 0.00000000e+00 8.84625706e-13
0.00000000e+00]]]
[[[ 0.00000000e+00 -1.37629996e-09 0.00000000e+00 -1.37629996e-09
0.00000000e+00]
[ 1.37629996e-09 0.00000000e+00 1.37629996e-09 0.00000000e+00
1.37629996e-09]
[ 0.00000000e+00 -1.37629996e-09 0.00000000e+00 -1.37629996e-09
0.00000000e+00]
[ 1.37629996e-09 0.00000000e+00 1.37629996e-09 0.00000000e+00
1.37629996e-09]
[ 0.00000000e+00 -1.37629996e-09 0.00000000e+00 -1.37629996e-09
0.00000000e+00]]]]