A subsampling method based on leverage scores, proposed by Ma et al. (2015).
Arguments
- X
A data.frame or matrix of explanatory variables.
- n
Subsample size.
- shrinkage_alpha
Shrinkage parameter for SLEV (shrinkage leveraging); defaults to 1 (no shrinkage).
- replace
Whether to sample with replacement (TRUE) or without replacement (FALSE); defaults to TRUE.
- seed
Random seed for the sampling.
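As a rough illustration of how these arguments interact, the sketch below follows the SLEV scheme described in Ma et al. (2015): each row's sampling probability is a mix of its normalized leverage score and the uniform probability 1/N, weighted by shrinkage_alpha. The helper name slev_sample_sketch is hypothetical and this is not necessarily how Leverage() is implemented internally; it assumes X is numeric with full column rank and computes leverage scores from X as given, without adding an intercept column.

```r
# Hypothetical helper for illustration only; not the package's implementation.
slev_sample_sketch <- function(X, n, shrinkage_alpha = 1,
                               replace = TRUE, seed = NULL) {
  X <- as.matrix(X)
  N <- nrow(X)

  # Leverage scores are the diagonal of the hat matrix X (X'X)^{-1} X'.
  # With a thin QR decomposition X = QR they equal the squared row norms of Q.
  Q <- qr.Q(qr(X))
  h <- rowSums(Q^2)

  # SLEV probabilities: shrink the leverage-based distribution towards uniform.
  # shrinkage_alpha = 1 gives pure leverage sampling (no shrinkage).
  prob <- shrinkage_alpha * h / sum(h) + (1 - shrinkage_alpha) / N

  if (!is.null(seed)) set.seed(seed)

  # Return the sampled row indices.
  sample.int(N, size = n, replace = replace, prob = prob)
}

# Usage on simulated data (X_demo is made up for this sketch).
set.seed(1)
X_demo <- matrix(rnorm(500 * 3), nrow = 500)
idx <- slev_sample_sketch(X_demo, n = 50, shrinkage_alpha = 0.9, seed = 42)
```

Values of shrinkage_alpha below 1 keep every row's probability at or above (1 - shrinkage_alpha) / N, which avoids near-zero sampling probabilities for low-leverage rows.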
References
Ping Ma, Michael W. Mahoney & Bin Yu (2015). A Statistical Perspective on Algorithmic Leveraging. Journal of Machine Learning Research, 16(27), 861–911. https://jmlr.csail.mit.edu/papers/v16/ma15a.html
Examples
data <- data_numeric_regression
X <- data[-which(names(data) == "y")]
Leverage(X, n = 100, shrinkage_alpha = 0.9, replace = TRUE, seed = NULL)
#> [1] 7832 9572 9449 8140 4219 13 1633 4363 4384 7691 1287 6985 3433 8341 1439
#> [16] 7331 2455 4514 5427 145 1404 1688 2737 7643 295 40 6636 6488 6886 6970
#> [31] 4710 3102 5731 1490 5825 2104 6841 8521 2040 8216 1950 1857 9049 5673 1659
#> [46] 4878 9147 7663 7100 1116 4946 396 583 9024 3189 4994 8249 8009 2135 6287
#> [61] 8985 4234 8841 3439 8805 8841 5666 2354 6163 4302 8139 8360 2518 6798 67
#> [76] 9302 7260 2172 2584 1882 6943 8431 6800 2404 1049 9303 4247 6289 4385 6192
#> [91] 6121 3386 4443 6626 8517 6828 2541 3076 4328 2288