A model-free subsampling method based on uniform designs, proposed by Zhang et al. (2023).
Arguments
- X
A data.frame or matrix of explanatory variables.
- n
Subsample size.
- ratio
Dimensionality reduction ratio for PCA. Defaults to 0.85.
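As a rough illustration of what a PCA reduction ratio of this kind can mean, the sketch below keeps the smallest number of principal components whose cumulative explained variance reaches the ratio. This reading of `ratio` is an assumption, not a statement of how DDS is implemented. The simulated matrix and the names `explained`, `k` and `Z` are hypothetical.

```r
# Illustrative only: assumes `ratio` is a cumulative explained-variance threshold.
set.seed(42)
X <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)
X[, 2] <- X[, 1] + 0.1 * rnorm(200)  # make two columns nearly collinear

ratio <- 0.85
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Cumulative share of total variance carried by the leading components
explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Smallest number of components whose cumulative share reaches `ratio`
k <- which(explained >= ratio)[1]

# Reduced representation: scores on the first k components
Z <- pca$x[, seq_len(k), drop = FALSE]
dim(Z)
```

Under this reading, a larger ratio keeps more components, so the subsample is spread over a higher-dimensional space.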
Details
The uniform design is generated according to the method described in Section 5 of Zhang et al. (2023). Uniformity is measured by the mixture discrepancy.
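Assuming the uniformity measure above refers to the mixture discrepancy used in the uniform design literature, the sketch below computes its squared value for a design with entries in [0, 1]. The helper `mixture_discrepancy2` is hypothetical and is not part of the package. It only illustrates the criterion, not how DDS constructs or searches for the design.

```r
# Squared mixture discrepancy of an n x s design D with entries in [0, 1].
# Hypothetical helper for illustration; smaller values mean a more uniform design.
mixture_discrepancy2 <- function(D) {
  D <- as.matrix(D)
  stopifnot(all(D >= 0 & D <= 1))
  n <- nrow(D)
  s <- ncol(D)
  a <- abs(D - 0.5)  # |x_ij - 1/2|

  # Single-sum term: product over columns, summed over rows
  term1 <- sum(apply(5 / 3 - a / 4 - a^2 / 4, 1, prod))

  # Double-sum term: accumulate the per-column kernel for all row pairs
  K <- matrix(1, n, n)
  for (j in seq_len(s)) {
    d <- abs(outer(D[, j], D[, j], "-"))  # |x_ij - x_kj|
    K <- K * (15 / 8 - outer(a[, j], a[, j], "+") / 4 - 3 * d / 4 + d^2 / 2)
  }

  (19 / 12)^s - 2 / n * term1 + sum(K) / n^2
}

# Compare a lattice-type design with purely random points
set.seed(1)
n <- 20
lattice <- cbind((1:n - 0.5) / n, (sample(n) - 0.5) / n)
random <- matrix(runif(2 * n), nrow = n, ncol = 2)
mixture_discrepancy2(lattice)
mixture_discrepancy2(random)
```

The double sum costs O(n^2 s) operations, which is acceptable for a design of subsample size n but would not be for the full data set.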
References
Mei Zhang, Yongdao Zhou, Zheng Zhou & Aijun Zhang (2023) Model-Free Subsampling Method Based on Uniform Designs, IEEE Transactions on Knowledge and Data Engineering, 36:3, 1210-1220, https://ieeexplore.ieee.org/abstract/document/10192374.
Examples
# Load the example data and drop the response column "y"
data <- data_numeric_regression
X <- data[names(data) != "y"]
# Select a subsample of size 100
DDS(X, n = 100, ratio = 0.85)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14]
#> [1,] 2238 2843 7010 8283 9340 1738 787 9555 1342 4120 9243 718 688 4792
#> [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26]
#> [1,] 3111 5176 6981 3831 4353 9894 3000 7966 311 8780 9324 1742
#> [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38]
#> [1,] 1390 8222 1827 7583 2335 9181 3109 2041 2395 6539 7812 5616
#> [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48] [,49] [,50]
#> [1,] 1753 5028 3147 460 7741 5427 670 3965 824 4025 406 9813
#> [,51] [,52] [,53] [,54] [,55] [,56] [,57] [,58] [,59] [,60] [,61] [,62]
#> [1,] 8686 6071 87 8553 85 2886 492 6377 3126 3064 423 8448
#> [,63] [,64] [,65] [,66] [,67] [,68] [,69] [,70] [,71] [,72] [,73] [,74]
#> [1,] 611 2565 6673 1971 9727 5396 3745 7576 4819 8385 4268 5835
#> [,75] [,76] [,77] [,78] [,79] [,80] [,81] [,82] [,83] [,84] [,85] [,86]
#> [1,] 6308 5660 7875 4666 4072 3578 2771 1801 4466 5332 4813 9383
#> [,87] [,88] [,89] [,90] [,91] [,92] [,93] [,94] [,95] [,96] [,97] [,98]
#> [1,] 1443 3293 4392 4804 2998 8 2750 6135 5014 657 5944 9737
#> [,99] [,100]
#> [1,] 2651 2354