I have a "my.dataset" like this:
ID Species SEX Category V1 V2 V3
87790 Caniceps F F_Caniceps -0.34 -0.55 0.61
199486 Caniceps F F_Caniceps -0.34 -0.56 0.63
199490 Caniceps F F_Caniceps -0.37 -0.54 0.57
199493 Caniceps F F_Caniceps -0.35 -0.54 0.58
200139 Caniceps F F_Caniceps -0.39 -0.51 0.51
393151 Caniceps M M_Caniceps -0.36 -0.56 0.55
393154 Caniceps M M_Caniceps -0.36 -0.55 0.55
486210 Caniceps M M_Caniceps -0.41 -0.50 0.45
811945 Hyemalis F F_Hyemalis -0.35 -0.54 0.55
811947 Hyemalis F F_Hyemalis -0.35 -0.59 0.62
15661 Hyemalis M M_Hyemalis -0.34 -0.56 0.62
15662 Hyemalis M M_Hyemalis -0.35 -0.53 0.53
15663 Hyemalis M M_Hyemalis -0.33 -0.58 0.68
15664 Vulcani F F_Vulcani -0.29 -0.57 0.71
15665 Vulcani F F_Vulcani -0.29 -0.56 0.67
15666 Vulcani F F_Vulcani -0.28 -0.55 0.70
486218 Vulcani F F_Vulcani -0.36 -0.55 0.56
486224 Vulcani F F_Vulcani -0.36 -0.54 0.56
486212 Vulcani M M_Vulcani -0.37 -0.53 0.53
486213 Vulcani M M_Vulcani -0.37 -0.53 0.54
199479 Vulcani M M_Vulcani -0.33 -0.57 0.61
199483 Vulcani M M_Vulcani -0.33 -0.62 0.69
199484 Vulcani M M_Vulcani -0.33 -0.60 0.65
I'm trying to perform a bootstrap with boot() to compute a statistic over variables "V1", "V2" and "V3", something like this (the LDA coefficients are just a placeholder statistic):
boot(my.dataset, statistic = function(d, i) as.vector(lda(SEX ~ V1 + V2 + V3, data = d[i, ])$scaling), R = 3, sim = "ordinary")
But I need each resample to contain the same number of individuals per level of the "Category" variable as in "my.dataset". Any idea how to do this?
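To make the goal concrete, here is a minimal sketch of what I think it would look like, assuming the `strata` argument of boot() is the right tool: it resamples within each stratum, so every "Category" level keeps its original count. The helper name `lda_coef` and the choice to return the LD1 coefficients are my own assumptions for illustration, not a recommendation of which statistic to bootstrap.

```r
library(boot)  # boot()
library(MASS)  # lda()

# The data from the question, read from text so the sketch is self-contained.
my.dataset <- read.table(header = TRUE, colClasses = c(
  "integer", "factor", "factor", "factor", "numeric", "numeric", "numeric"),
  text = "
ID Species SEX Category V1 V2 V3
87790 Caniceps F F_Caniceps -0.34 -0.55 0.61
199486 Caniceps F F_Caniceps -0.34 -0.56 0.63
199490 Caniceps F F_Caniceps -0.37 -0.54 0.57
199493 Caniceps F F_Caniceps -0.35 -0.54 0.58
200139 Caniceps F F_Caniceps -0.39 -0.51 0.51
393151 Caniceps M M_Caniceps -0.36 -0.56 0.55
393154 Caniceps M M_Caniceps -0.36 -0.55 0.55
486210 Caniceps M M_Caniceps -0.41 -0.50 0.45
811945 Hyemalis F F_Hyemalis -0.35 -0.54 0.55
811947 Hyemalis F F_Hyemalis -0.35 -0.59 0.62
15661 Hyemalis M M_Hyemalis -0.34 -0.56 0.62
15662 Hyemalis M M_Hyemalis -0.35 -0.53 0.53
15663 Hyemalis M M_Hyemalis -0.33 -0.58 0.68
15664 Vulcani F F_Vulcani -0.29 -0.57 0.71
15665 Vulcani F F_Vulcani -0.29 -0.56 0.67
15666 Vulcani F F_Vulcani -0.28 -0.55 0.70
486218 Vulcani F F_Vulcani -0.36 -0.55 0.56
486224 Vulcani F F_Vulcani -0.36 -0.54 0.56
486212 Vulcani M M_Vulcani -0.37 -0.53 0.53
486213 Vulcani M M_Vulcani -0.37 -0.53 0.54
199479 Vulcani M M_Vulcani -0.33 -0.57 0.61
199483 Vulcani M M_Vulcani -0.33 -0.62 0.69
199484 Vulcani M M_Vulcani -0.33 -0.60 0.65
")

# Hypothetical statistic: boot() calls it with the data and a vector of
# resampled row indices; it must return a numeric vector.
lda_coef <- function(d, i) {
  fit <- lda(SEX ~ V1 + V2 + V3, data = d[i, ])
  as.vector(fit$scaling)  # LD1 coefficients for V1, V2, V3
}

set.seed(1)  # only for reproducibility of the sketch
res <- boot(my.dataset, statistic = lda_coef, R = 3, sim = "ordinary",
            strata = my.dataset$Category)  # resample within each Category
```

Here `strata` must be a factor (or integer vector) with one entry per row, which is why "Category" is read in as a factor. R = 3 is kept from my original call; a real run would presumably use many more replicates.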