covar.cov_shrink_rblw

covar.cov_shrink_rblw(S, n, shrinkage=None)

Compute a shrinkage estimate of the covariance matrix using the Rao-Blackwellized Ledoit-Wolf estimator described by Chen et al.

Parameters:

S : array, shape=(p, p)

Sample covariance matrix (e.g. estimated with np.cov(X.T))

n : int

Number of data points used in the estimate of S.

shrinkage : float, optional

The covariance shrinkage intensity (range 0-1). If shrinkage is not specified (the default) it is estimated using an analytic formula from Chen et al. (2009).

Returns:

sigma : array, shape=(p, p)

Estimated shrunk covariance matrix

shrinkage : float

The applied covariance shrinkage intensity.

See also

cov_shrink_ss
similar method, using a different shrinkage target, \(T\).
sklearn.covariance.ledoit_wolf
very similar approach using the same shrinkage target, \(T\), but a different method for estimating the shrinkage intensity, \(\gamma\).

Notes

This shrinkage estimator takes the form

\[\hat{\Sigma} = (1-\gamma) \Sigma^{sample} + \gamma T\]

where \(\Sigma^{sample}\) is the (noisy but unbiased) empirical covariance matrix,

\[\Sigma^{sample}_{ij} = \frac{1}{n-1} \sum_{k=1}^n (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j),\]

the matrix \(T\) is the shrinkage target, a less noisy but biased estimator for the covariance, and the scalar \(\gamma \in [0, 1]\) is the shrinkage intensity (regularization strength). This approach uses a scaled identity target, \(T\):

\[T = \frac{\mathrm{Tr}(S)}{p} I_p\]

The shrinkage intensity, \(\gamma\), is determined using the RBLW estimator from [2]. The formula for \(\gamma\) is

\[\gamma = \min\left(\alpha + \frac{\beta}{U},\; 1\right)\]

where \(\alpha\), \(\beta\), and \(U\) are

\[\begin{split}\alpha &= \frac{n-2}{n(n+2)} \\ \beta &= \frac{(p+1)n - 2}{n(n+2)} \\ U &= \frac{p\, \mathrm{Tr}(S^2)}{\mathrm{Tr}^2(S)} - 1\end{split}\]
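As a concrete illustration, the formulas above can be sketched directly in NumPy. This is a minimal reimplementation for clarity, not the library's own code; the function name `rblw_shrink` is hypothetical:

```python
import numpy as np

def rblw_shrink(S, n):
    """Sketch of the RBLW shrinkage estimator from the formulas above.

    S : (p, p) sample covariance matrix; n : number of data points.
    Returns (sigma, gamma).
    """
    p = S.shape[0]
    tr_S = np.trace(S)
    tr_S2 = np.trace(S @ S)

    # Scalars from the formula for gamma
    alpha = (n - 2) / (n * (n + 2))
    beta = ((p + 1) * n - 2) / (n * (n + 2))
    U = p * tr_S2 / tr_S**2 - 1

    # Shrinkage intensity, capped at 1
    gamma = min(alpha + beta / U, 1.0)

    # Scaled-identity shrinkage target T = (Tr(S)/p) I_p
    T = (tr_S / p) * np.eye(p)

    sigma = (1 - gamma) * S + gamma * T
    return sigma, gamma
```

Note that because \(T\) has the same trace as \(S\), the shrunk estimate preserves the trace of the sample covariance, and only the sample covariance and n are needed, as the surrounding text describes.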

One particularly useful property of this estimator is that it is very fast, because it does not require access to the data matrix at all (unlike cov_shrink_ss()): the sample covariance matrix S and the number of data points n are sufficient statistics.

For reference, note that [2] defines another estimator, called the oracle approximating shrinkage estimator (OAS), but makes some mathematical errors during the derivation, and furthermore the example code published with the paper does not implement the proposed formulas.

References

[2] Chen, Yilun, Ami Wiesel, and Alfred O. Hero III. "Shrinkage estimation of high dimensional covariance matrices." ICASSP (2009). http://doi.org/10.1109/ICASSP.2009.4960239