covar.cov_shrink_ss

covar.cov_shrink_ss()

Compute a shrinkage estimate of the covariance matrix using the Schafer and Strimmer (2005) method.

Parameters:

X : array, shape=(n, p)

Data matrix. Each row represents a data point, and each column represents a feature.

shrinkage : float, optional

The covariance shrinkage intensity (range 0-1). If shrinkage is not specified (the default) it is estimated using an analytic formula from Schafer and Strimmer (2005). For shrinkage=0 the empirical correlations are recovered.

Returns:

cov : array, shape=(p, p)

Estimated covariance matrix of the data.

shrinkage : float

The applied covariance shrinkage intensity.

See also

cov_shrink_rblw
similar method, using a different shrinkage target, \(T\).
sklearn.covariance.ledoit_wolf
very similar approach, but uses a different shrinkage target, \(T\).

Notes

This shrinkage estimator corresponds to “Target D”: (diagonal, unequal variance) as described in [1]. The estimator takes the form

\[\hat{\Sigma} = (1-\gamma) \Sigma_{sample} + \gamma T,\]

where \(\Sigma^{sample}\) is the (noisy but unbiased) empirical covariance matrix,

\[\Sigma^{sample}_{ij} = \frac{1}{n-1} \sum_{k=1}^n (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j),\]

the matrix \(T\) is the shrinkage target, a less noisy but biased estimator for the covariance, and the scalar \(\gamma \in [0, 1]\) is the shrinkage intensity (regularization strength). This approaches uses a diagonal shrinkage target, \(T\):

\[\begin{split}T_{ij} = \begin{cases} \Sigma^{sample}_{ii} &\text{ if } i = j\\ 0 &\text{ otherwise}, \end{cases}\end{split}\]

The idea is that by taking a weighted average of these two estimators, we can get a combined estimator which is more accurate than either is individually, especially when \(p\) is large. The optimal weighting, \(\gamma\), is determined automatically by minimizing the mean squared error. See [1] for details on how this can be done. The formula for \(\gamma\) is

\[\gamma = \frac{\sum_{i \neq j} \hat{Var}(r_{ij})}{\sum_{i \neq j} r^2_{ij}}\]

where \(r\) is the sample correlation matrix,

\[r_{ij} = \frac{\Sigma^{sample}_{ij}}{\sigma_i \sigma_j},\]

and \(\hat{Var}(r_{ij})\) is given by

\[\hat{Var}(r_{ij}) = \frac{n}{(n-1)^3 \sigma_i^2 \sigma_j^2} \sum_{k=1}^n (w_{kij} - \bar{w}_{ij})^2,\]

with \(w_{kij} = (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)\), and \(\bar{w}_{ij} = \frac{1}{n}\sum_{k=1}^n w_{kij}\).

This method is equivalent to the cov.shrink method in the R package corpcor, if the argument lambda.var is set to 0. See https://cran.r-project.org/web/packages/corpcor/ for details.

References

[R2]Schafer, J., and K. Strimmer. 2005. A shrinkage approach to large-scale covariance estimation and implications for functional genomics. Statist. Appl. Genet. Mol. Biol. 4:32. http://doi.org/10.2202/1544-6115.1175