covar.cov_shrink_ss
- covar.cov_shrink_ss()
Compute a shrinkage estimate of the covariance matrix using the Schafer and Strimmer (2005) method.
Parameters: X : array, shape=(n, p)
Data matrix. Each row represents a data point, and each column represents a feature.
shrinkage : float, optional
The covariance shrinkage intensity (range 0-1). If shrinkage is not specified (the default) it is estimated using an analytic formula from Schafer and Strimmer (2005). For shrinkage=0 the empirical correlations are recovered.
Returns: cov : array, shape=(p, p)
Estimated covariance matrix of the data.
shrinkage : float
The applied covariance shrinkage intensity.
See also
- cov_shrink_rblw
- similar method, using a different shrinkage target, \(T\).
- sklearn.covariance.ledoit_wolf
- very similar approach, but uses a different shrinkage target, \(T\).
Notes
This shrinkage estimator corresponds to “Target D” (diagonal, unequal variance) as described in [1]. The estimator takes the form
\[\hat{\Sigma} = (1-\gamma) \Sigma^{sample} + \gamma T,\]where \(\Sigma^{sample}\) is the (noisy but unbiased) empirical covariance matrix,
\[\Sigma^{sample}_{ij} = \frac{1}{n-1} \sum_{k=1}^n (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j),\]the matrix \(T\) is the shrinkage target, a less noisy but biased estimator for the covariance, and the scalar \(\gamma \in [0, 1]\) is the shrinkage intensity (regularization strength). This approach uses a diagonal shrinkage target, \(T\):
\[\begin{split}T_{ij} = \begin{cases} \Sigma^{sample}_{ii} &\text{ if } i = j\\ 0 &\text{ otherwise}. \end{cases}\end{split}\]The idea is that by taking a weighted average of these two estimators, we can get a combined estimator that is more accurate than either one individually, especially when \(p\) is large. The optimal weighting, \(\gamma\), is determined automatically by minimizing the mean squared error. See [1] for details on how this can be done. The formula for \(\gamma\) is
\[\gamma = \frac{\sum_{i \neq j} \hat{Var}(r_{ij})}{\sum_{i \neq j} r^2_{ij}}\]where \(r\) is the sample correlation matrix,
\[r_{ij} = \frac{\Sigma^{sample}_{ij}}{\sigma_i \sigma_j},\]with \(\sigma_i = \sqrt{\Sigma^{sample}_{ii}}\) the sample standard deviation of feature \(i\), and \(\hat{Var}(r_{ij})\) is given by
\[\hat{Var}(r_{ij}) = \frac{n}{(n-1)^3 \sigma_i^2 \sigma_j^2} \sum_{k=1}^n (w_{kij} - \bar{w}_{ij})^2,\]with \(w_{kij} = (x_{ki} - \bar{x}_i)(x_{kj} - \bar{x}_j)\), and \(\bar{w}_{ij} = \frac{1}{n}\sum_{k=1}^n w_{kij}\).
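The formulas above can be sketched directly in NumPy. The following is an illustrative re-implementation, not the library's own code. The function name `shrink_ss_sketch` is hypothetical, and clipping \(\gamma\) to \([0, 1]\) is an assumption made here so that the result stays a valid weighting. The sketch also assumes that no feature is constant and that the sample correlations are not all zero, since either case would cause a division by zero.

```python
import numpy as np


def shrink_ss_sketch(X, shrinkage=None):
    """Hypothetical sketch of the estimator described above (Target D)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape

    # Empirical covariance with the 1/(n-1) normalisation.
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / (n - 1)

    # Sample standard deviations and correlation matrix r.
    sigma = np.sqrt(np.diag(S))
    r = S / np.outer(sigma, sigma)

    if shrinkage is None:
        # w[k, i, j] = (x_ki - xbar_i) * (x_kj - xbar_j); needs O(n * p^2) memory.
        w = Xc[:, :, None] * Xc[:, None, :]
        w_bar = w.mean(axis=0)

        # Estimated variance of each correlation coefficient.
        var_r = (n / (n - 1) ** 3) * ((w - w_bar) ** 2).sum(axis=0)
        var_r /= np.outer(sigma ** 2, sigma ** 2)

        # Sum over off-diagonal entries only (i != j).
        off = ~np.eye(p, dtype=bool)
        gamma = var_r[off].sum() / (r[off] ** 2).sum()

        # Assumption: clip so that gamma stays in [0, 1].
        gamma = float(np.clip(gamma, 0.0, 1.0))
    else:
        gamma = float(shrinkage)

    # Diagonal target: keep the sample variances, zero the covariances.
    T = np.diag(np.diag(S))
    cov = (1.0 - gamma) * S + gamma * T
    return cov, gamma


# Example usage on synthetic data (20 samples, 5 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
cov_demo, gamma_demo = shrink_ss_sketch(X_demo)
```

Because the target shares its diagonal with \(\Sigma^{sample}\), the variances are left unchanged and only the off-diagonal entries are scaled by \(1-\gamma\).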
This method is equivalent to the cov.shrink method in the R package corpcor, if the argument lambda.var is set to 0. See https://cran.r-project.org/web/packages/corpcor/ for details.
References
[1] Schafer, J., and K. Strimmer. 2005. A shrinkage approach to large-scale covariance estimation and implications for functional genomics. Statist. Appl. Genet. Mol. Biol. 4:32. http://doi.org/10.2202/1544-6115.1175