covar.cov_shrink_ss

covar.cov_shrink_ss()

Compute a shrinkage estimate of the covariance matrix using the Schafer and Strimmer (2005) method.

Parameters:

X : array, shape=(n, p)

Data matrix. Each row represents a data point, and each column represents a feature.

shrinkage : float, optional

The covariance shrinkage intensity, in the range [0, 1]. If shrinkage is not specified (the default), it is estimated using an analytic formula from Schafer and Strimmer (2005). For shrinkage=0 the empirical correlations are recovered.

Returns:

cov : array, shape=(p, p)

Estimated covariance matrix of the data.

shrinkage : float

The applied covariance shrinkage intensity.

See also

cov_shrink_rblw: similar method, using a different shrinkage target, $T$.
sklearn.covariance.ledoit_wolf: very similar approach, but uses a different shrinkage target, $T$.

Notes

This shrinkage estimator corresponds to "Target D" (diagonal, unequal variance) as described in [1]. The estimator takes the form

$\hat{\Sigma} = (1-\gamma) \Sigma^{sample} + \gamma T,$

where $\Sigma^{sample}$ is the (noisy but unbiased) empirical covariance matrix,

$\Sigma^{sample}_{ij} = \frac{1}{n-1} \sum_{k=1}^n (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j),$

the matrix $T$ is the shrinkage target, a less noisy but biased estimator for the covariance, and the scalar $\gamma \in [0, 1]$ is the shrinkage intensity (regularization strength). This approach uses a diagonal shrinkage target, $T$:

$T_{ij} = \begin{cases} \Sigma^{sample}_{ii} &\text{ if } i = j\\ 0 &\text{ otherwise}. \end{cases}$
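As a rough illustration of the estimator's form, the sketch below builds the empirical covariance and the diagonal target with NumPy and blends them for a fixed, user-supplied $\gamma$. The helper name `shrink_to_diagonal` and the value `gamma=0.3` are hypothetical choices made for this example; this is not the library's implementation.

```python
import numpy as np

def shrink_to_diagonal(X, gamma):
    """Return (1 - gamma) * S + gamma * T for the diagonal target T (illustrative helper)."""
    S = np.cov(X, rowvar=False)   # empirical covariance with 1/(n-1) normalization
    T = np.diag(np.diag(S))       # target: keep the variances, zero the covariances
    return (1.0 - gamma) * S + gamma * T

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 25))             # n=10 samples, p=25 features, so p > n
S = np.cov(X, rowvar=False)
sigma_hat = shrink_to_diagonal(X, gamma=0.3)  # gamma chosen arbitrarily for the demo

# The variances are unchanged; only the off-diagonal entries are scaled by (1 - gamma).
print(np.allclose(np.diag(sigma_hat), np.diag(S)))
# With p > n the empirical covariance is singular, while the blend is positive definite.
print(np.linalg.eigvalsh(S).min(), np.linalg.eigvalsh(sigma_hat).min())
```

Because $T$ shares its diagonal with $\Sigma^{sample}$, the blend leaves the variances untouched and only damps the covariances, which is why any $\gamma > 0$ yields an invertible estimate whenever all sample variances are positive.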

The idea is that by taking a weighted average of these two estimators, we can get a combined estimator that is more accurate than either one individually, especially when $p$ is large. The optimal weighting, $\gamma$, is determined automatically by minimizing the mean squared error. See [1] for details on how this can be done. The formula for $\gamma$ is

$\gamma = \frac{\sum_{i \neq j} \hat{Var}(r_{ij})}{\sum_{i \neq j} r^2_{ij}}$

where $r$ is the sample correlation matrix (with $\sigma_i = \sqrt{\Sigma^{sample}_{ii}}$ the sample standard deviation of feature $i$),

$r_{ij} = \frac{\Sigma^{sample}_{ij}}{\sigma_i \sigma_j},$

and $\hat{Var}(r_{ij})$ is given by

$\hat{Var}(r_{ij}) = \frac{n}{(n-1)^3 \sigma_i^2 \sigma_j^2} \sum_{k=1}^n (w_{kij} - \bar{w}_{ij})^2,$

with $w_{kij} = (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)$, and $\bar{w}_{ij} = \frac{1}{n}\sum_{k=1}^n w_{kij}$.
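The formulas above can be transcribed fairly directly into NumPy. The following is a minimal sketch under stated assumptions, not the library's code: the function name `estimate_gamma` is hypothetical, the final clipping to [0, 1] is an assumption added so the result respects the stated range of $\gamma$, and the dense `(n, p, p)` intermediate array makes it suitable only for small $p$.

```python
import numpy as np

def estimate_gamma(X):
    """Transcribe the shrinkage-intensity formula for an (n, p) data matrix (illustrative)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)                    # x_ki - xbar_i
    S = Xc.T @ Xc / (n - 1)                    # empirical covariance
    sigma = np.sqrt(np.diag(S))                # sample standard deviations
    r = S / np.outer(sigma, sigma)             # sample correlation matrix

    w = Xc[:, :, None] * Xc[:, None, :]        # w[k, i, j] = (x_ki - xbar_i)(x_kj - xbar_j)
    w_bar = w.mean(axis=0)                     # average over the n samples
    scale = n / ((n - 1) ** 3 * np.outer(sigma**2, sigma**2))
    var_r = scale * ((w - w_bar) ** 2).sum(axis=0)

    off = ~np.eye(p, dtype=bool)               # mask selecting the i != j entries
    gamma = var_r[off].sum() / (r[off] ** 2).sum()
    return float(np.clip(gamma, 0.0, 1.0))     # assumption: keep gamma inside [0, 1]

rng = np.random.default_rng(0)
X_indep = rng.standard_normal((20, 5))         # few samples, independent features
z = rng.standard_normal((500, 1))
X_corr = z + 0.1 * rng.standard_normal((500, 4))  # many samples, strongly correlated features
print(estimate_gamma(X_indep), estimate_gamma(X_corr))
```

Intuitively, the numerator measures how noisy the estimated correlations are and the denominator how large they are, so small samples of weakly correlated features push $\gamma$ up, while large samples of strongly correlated features push it toward zero.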

This method is equivalent to the cov.shrink function in the R package corpcor when the argument lambda.var is set to 0. See https://cran.r-project.org/web/packages/corpcor/ for details.

References

[1] Schafer, J., and K. Strimmer. 2005. A shrinkage approach to large-scale covariance estimation and implications for functional genomics. Statist. Appl. Genet. Mol. Biol. 4:32. http://doi.org/10.2202/1544-6115.1175