dynstatcov.Dynstatcov

class dynstatcov.Dynstatcov

Dynamically updateable statistical co-variance matrix.

Implements a minimal method for dynamically updating a statistical co-variance matrix when a new observation arrives. Optimized for speed and low memory use: the computational and memory requirements depend only on the number of features in each observation vector, not on the number of observations.
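To illustrate why the cost can be independent of the number of observations, here is a minimal NumPy sketch of one well-known way to update a sample co-variance matrix incrementally (a Welford-style rank-one update). The class name OnlineCov and its attributes are hypothetical and chosen for illustration only; this is not necessarily the method Dynstatcov uses internally.

```python
import numpy as np


class OnlineCov:
    """Hypothetical illustration only, not the Dynstatcov implementation."""

    def __init__(self, X):
        X = np.asarray(X, dtype=np.float64)
        self.n = X.shape[0]                 # number of samples seen so far
        self.mean = X.mean(axis=0)          # running mean, one entry per feature
        centered = X - self.mean
        # Sum of outer products of deviations from the mean.
        self.comoment = centered.T @ centered

    def update(self, y):
        """Add one observation; the cost depends only on the feature count."""
        y = np.asarray(y, dtype=np.float64)
        self.n += 1
        delta = y - self.mean               # deviation from the old mean
        self.mean += delta / self.n
        # Rank-one update: old-mean deviation times new-mean deviation.
        self.comoment += np.outer(delta, y - self.mean)

    def cov(self):
        """Sample co-variance (normalized by n - 1)."""
        return self.comoment / (self.n - 1)


X = np.array([[3, 4, 1, 2],
              [4, 4, 0, 3],
              [1, 3, 4, 2]], dtype=np.float64)
y = np.array([3, 4, 1, 2], dtype=np.float64)

oc = OnlineCov(X)
oc.update(y)
# The incremental result should agree with a batch computation.
batch = np.cov(np.vstack([X, y]), rowvar=False)
agrees = np.allclose(oc.cov(), batch)
```

Only the sample count, the mean vector and one square matrix are kept, so the stored state does not grow as more observations arrive.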

Parameters:

X : ndarray

Two-dimensional row-major/C-ordered matrix with one sample (observation) per row and one feature per column.

Notes

Dynstatcov can be compiled with either single precision (numpy.float32) or double precision (numpy.float64). You might have to test which version you are running. By default, it is double precision. If required, you can always change the typedef of DTYPE_t in the Cython code and re-compile.
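One possible way to test the precision is to inspect the dtype of a returned matrix. The helper below is a hypothetical sketch, not part of dynstatcov; it assumes that the array returned by get_cov() carries the dtype the module was compiled with.

```python
import numpy as np


def precision_name(cov):
    """Return 'single' or 'double' for a float32 or float64 array.

    Hypothetical helper, not part of dynstatcov.
    """
    dtype = np.asarray(cov).dtype
    if dtype == np.float32:
        return "single"
    if dtype == np.float64:
        return "double"
    raise TypeError(f"unexpected dtype: {dtype}")
```

Under that assumption, precision_name(dsc.get_cov()) would report which build is in use.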

Examples

Example with initially three observations of four features each, which is updated once.

>>> import numpy as np
>>> from dynstatcov import Dynstatcov
>>> X = np.array([[3, 4, 1, 2],
...               [4, 4, 0, 3],
...               [1, 3, 4, 2]], dtype=np.float64)
>>> dsc = Dynstatcov(X)
>>> dsc.get_cov()
array([[ 2.33333397,  0.83333397, -3.16666698,  0.66666603],
       [ 0.83333397,  0.33333206, -1.16666603,  0.16666603],
       [-3.16666698, -1.16666603,  4.33333302, -0.83333397],
       [ 0.66666603,  0.16666603, -0.83333397,  0.33333397]], dtype=float64)
>>> dsc.get_n_samples()
3
>>> y = np.array([3, 4, 1, 2], dtype=np.float64)
>>> dsc.update(y)
>>> dsc.get_cov()
array([[ 1.58333337,  0.58333337, -2.16666675,  0.41666669],
       [ 0.58333337,  0.25      , -0.83333337,  0.08333334],
       [-2.16666675, -0.83333337,  3.        , -0.5       ],
       [ 0.41666669,  0.08333334, -0.5       ,  0.25      ]], dtype=float64)

Internally, only the upper triangular part of the symmetric co-variance matrix is stored, and the full matrix is reconstructed on each call to get_cov(). Alternatively, you can request just the upper triangular part with get_cov_tri(), which is slightly faster.

>>> dsc.get_cov_tri()
array([ 1.58333337,  0.58333337, -2.16666675,  0.41666669,  0.25      ,
       -0.83333337,  0.08333334,  3.        , -0.5       ,  0.25      ], dtype=float64)
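If the full matrix is needed later, the packed vector can be expanded again. The following sketch assumes the row-major packing of the upper triangle that the output above suggests (first row, then the remainder of the second row, and so on); unpack_upper is a hypothetical helper, not part of dynstatcov.

```python
import numpy as np


def unpack_upper(tri, n_features):
    """Expand a row-major packed upper triangle into a full symmetric matrix.

    Hypothetical helper, not part of dynstatcov.
    """
    tri = np.asarray(tri)
    full = np.zeros((n_features, n_features), dtype=tri.dtype)
    rows, cols = np.triu_indices(n_features)  # row-major upper-triangle indices
    full[rows, cols] = tri                    # fill the upper part
    full[cols, rows] = tri                    # mirror into the lower part
    return full


# Packed values copied from the get_cov_tri() output above.
tri = np.array([1.58333337, 0.58333337, -2.16666675, 0.41666669, 0.25,
                -0.83333337, 0.08333334, 3.0, -0.5, 0.25])
full = unpack_upper(tri, 4)
```

For n features the packed vector holds n * (n + 1) / 2 values, which is 10 for the four features used here.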

We can also remove a sample by setting the optional second argument of update() to a non-zero value.

>>> dsc.update(y, 1)
>>> dsc.get_cov()
array([[ 2.33333397,  0.83333397, -3.16666698,  0.66666603],
       [ 0.83333397,  0.33333206, -1.16666603,  0.16666603],
       [-3.16666698, -1.16666603,  4.33333302, -0.83333397],
       [ 0.66666603,  0.16666603, -0.83333397,  0.33333397]], dtype=float64)
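Removing a sample can be understood as running an incremental update in reverse. The sketch below shows one way to do this with a Welford-style rank-one downdate in NumPy; the function remove_sample and its arguments are hypothetical, and this is not necessarily how update() is implemented.

```python
import numpy as np


def remove_sample(n, mean, comoment, y):
    """Return (n, mean, comoment) as they were before sample y was added.

    Hypothetical illustration only. `comoment` is the sum of outer
    products of deviations from the mean, i.e. cov * (n - 1).
    """
    if n < 2:
        raise ValueError("need at least two samples to remove one")
    mean_without = (n * mean - y) / (n - 1)
    # Reverse of the rank-one update used when adding a sample.
    comoment_without = comoment - np.outer(y - mean_without, y - mean)
    return n - 1, mean_without, comoment_without


X = np.array([[3, 4, 1, 2],
              [4, 4, 0, 3],
              [1, 3, 4, 2]], dtype=np.float64)
y = np.array([3, 4, 1, 2], dtype=np.float64)

# State after y has been added to the three initial samples.
Z = np.vstack([X, y])
mean = Z.mean(axis=0)
comoment = (Z - mean).T @ (Z - mean)

# Take y out again; the co-variance should match that of X alone.
n, mean, comoment = remove_sample(len(Z), mean, comoment, y)
cov = comoment / (n - 1)
restored = np.allclose(cov, np.cov(X, rowvar=False))
```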

Methods

get_cov Full co-variance matrix.
get_cov_tri Upper triangular part of the co-variance matrix.
get_n_samples Number of samples.
update Add a new sample (or remove one) and update the co-variance matrix.